Foreword


U.V. Joseph 


Don Bosco Umswai, Assam 


When I was asked to write a Foreword to NEIL Vol 7, I was reluctant at first. It is an honour I 
do not deserve. At the same time, from somewhere within that part of me that has been so 
much affected by contact with some of the numerous tribes, cultures and languages of the 
Northeast, a whispering voice kept encouraging me. That voice could not be stilled. I heed that 
voice more to thank the fascinating array of tribes, cultures and languages of the Northeast in 
general, and NEILS in particular, than to say anything substantial. 

And as I sat to write, my thoughts sprang and rewound to the past. After more than 
four hours of head-spinning swinging and swirling through the tight curves of the old 
Guwahati-Shillong road, and, of course, the customary tea break at Nongpoh, we (six of us 
including a guide) reached Shillong Bus Station on Jail Road, a few feet below Khyndailaid 
(Police Bazar). It was the evening of 17th May, 1976. It was raining; the city roads were slick 
and they glistened in the evening lights. We had arrived in Guwahati that afternoon, having 
travelled from the southern State of Kerala, changing trains at Chennai (then Madras), 
Kolkata (then Calcutta) and finally scrambling onto the metre-gauge train at New 
Bongaigaon. This metre-gauge train had no reservation at all; everyone just ran and climbed 
onto it, trying to grab a seat. 

My companions and I had just appeared for our 10th Standard examinations 
towards the end of March that year, and the results would be known only in June. We were 
just 15- or 16-year-old boys. Eventually my four companions returned to Kerala, while I have 
been in the Northeast all along, except for two years at Yercaud (in Tamil Nadu) and two 
years at Pune (in Maharashtra). To be more precise, my Northeastern years so far have been 
only in the States of Meghalaya (in Shillong, Umran and Nongpoh in the Khasi Hills and in 
Rongjeng in the Garo Hills) and Assam (in Guwahati, Damra, Umswai and Sojong). 

In all these places it has always been the different tribes, cultures and their languages that 
have fascinated me. In the initial days it felt so completely different from my home state, Kerala, 
where one never even thought of any language other than Malayalam in the normal 
social situation, and some English and Hindi in school. (The Kerala situation has changed 
much in the last decade!) 

The multilingual multiethnic situation of Northeast India is still a part of its great glory 
as it was forty years ago; however, a keen observer would not fail to notice that this same area 
of elegance and beauty of the Northeast is also an area of concern today. It would be naive not 
to concede that Northeast India is presently at a historic juncture of demographic and 
linguistic reshuffling and realignment that is unprecedented in its known history. Several 
agents converge into a massive force that tears apart the cultural and linguistic boundaries of 
many a linguistic and ethnic entity of Northeast India. The most prominent among them are 
mass media (print as well as electronic), trade and commerce, rendered far easier and more 
necessary today than in the past, and the spread of the modern system of education. 

The modern educational system with its emphasis on English as the medium of 
instruction is foremost among the three agents mentioned above. English symbolizes 
educational and economic success as well as upward social mobility for individuals and for 
communities. In the great rush for social and educational advancement, quite akin to the gold 
rush of yesteryears, everyone wants English medium education for their community and for 
their children. There is a common belief that the earlier a student begins to learn English, the 
better s/he is able to master that language. Accordingly English medium Primary Schools (and 
even English medium pre-primary schools) are a common feature even in the remotest of 
villages of Northeast India. The multilingual multiethnic demographic composition of the 
region makes the choice of English appear the only option available. 

The massive and incontrovertible evidence that putting a child through the Primary 
School with a medium of instruction other than its mother tongue is harmful to the 
concept-forming ability and general mental development of the child falls on deaf ears. The 
educational myth that it is best to send a child to an English-medium school right from the 
nursery class is swallowed hook, line and sinker all over India, and with greater ease in 
Northeast India. 

I saw the debilitating effect of such a system at close quarters when I was principal of 
Don Bosco Umswai (Karbi Anglong) that had then, and still has, Tiwas and Karbis on its 
rolls. Fortunately the teachers as well as the parents of the students recognized the problem. 
Hence we were able to make some effort to have mother tongue education for Tiwa and Karbi 
children in that school. With the help of two separate teams, one Tiwa and the other Karbi, we 
developed textbooks (nearly 50 books) for all the subjects and for all the classes up to the 
fourth Standard. Gradually, in place of the English-medium Primary Classes the school had 
Tiwa and Karbi medium Primary classes: two schools under the same roof! There was a 
quantum leap in understanding and assimilation on the part of the students. 

The system was taken up by other schools in Karbi Anglong: in Amkachi, Sojong, 
Karbi Rongsopi, Arnam-Longthu (Baithalangso), Rongkiri, Borkok, Mynser and in a few 
other schools for Karbi, and in Thawlaw, Tipali and Simlikhunji for Tiwa. For various 
reasons several of these schools have rolled back the experiment; Don Bosco School Umswai 
itself shut down its Karbi medium section, while the Tiwa medium section still continues, 
much to the benefit of the Tiwa students. How I wish such efforts were sustained and 
promoted! 

The Umswai mother tongue venture was initiated in 2006, the year of the first NEILS 
conference. It is the colourful linguistic scenario of Northeast India that makes Northeastern 
linguistics attractive and rewarding. Although linguistics looks at languages at a different 
level and from a different angle, through reverse osmosis languages and communities get 
highlighted and archived, and in some cases protected and preserved. Thus NEILS and the 
many linguists, foreign and Indian, who work on the different languages of the Northeast do 
great honour to the wealth of the Northeast's linguistic diversity. The life-long enthusiasm of 
pioneers like Robbins Burling has a rub-off effect on younger linguists. (I recall the 
excitement I had when Robbins Burling and I made several field-trips to Diyungbra (North 
Cachar Hills) from Umswai (Karbi Anglong) in an effort to understand and learn Dimasa.) 
Stephen Morey, Mark Post and Jyotiprakash Tamuli, who conceived and gave shape to 
NEILS, and who continue to organize the biennial conferences with the assistance of many 
others, deserve much praise. The editors of the NEILS volumes do a great deal of selfless 
work and spend much time on the various papers. The editors of the present volume, Linda 
Konnerth, Stephen Morey, Priyankoo Sarmah and Amos Teo, must surely have spent many 
long hours on the work and must have sent dozens of emails back and forth to coax, cajole, 
remind, and encourage several people in order to keep the entire process moving along. The 
result is the present volume. Thanks to them! 

May NEILS and Northeastern linguistics grow from strength to strength, and may the 
rich tapestry of Northeastern languages grow and flourish in all their variegated beauty. 


A Note from the Editors 


This seventh volume of North East Indian Linguistics appears on the tenth anniversary of the 
North East Indian Linguistics Society (NEILS) conferences. Formed in 2005 by Jyotiprakash 
Tamuli (Gauhati University) and Mark Post (then at La Trobe University, Australia), and 
subsequently joined by Stephen Morey (also from La Trobe University), the North East Indian 
Linguistics Society had its inaugural international conference in February 2006 at Gauhati 
University. Selected papers from the NEILS conferences have been appearing in the North 
East Indian Linguistics volumes since. It is our great pleasure that with NEIL7, a total of 101 
papers have appeared, all peer-reviewed by leading international specialists in the relevant 
subfields. The NEIL articles are testament to the vibrant and growing researcher community 
of North East Indian scholars working on languages of their own region as well as 
international scholars from Europe, Asia, and North America. While the core group of 
Jyotiprakash Tamuli, Mark Post, and Stephen Morey continue to pull NEILS forward by 
being involved in the organization of the conferences and the editing of the proceedings 
publications, some new faces have joined in to help coordinate NEIL activities in an effort to 
make the NEILS scholarly network and academic exchange grow larger and stronger. There is 
no doubt that the enthusiasm and dedication of the initial three people behind NEILS have been 
contagious and are the main reason for the success of NEILS. It is for the documentation and 
scientific investigation of the fascinating languages of the North East and the warm and 
hospitable diverse peoples that speak them that NEILS was formed, and this spirit has been 
with NEILS throughout the years. 

The papers for this anniversary volume were initially presented at the seventh and 
eighth meetings of the North East Indian Linguistics Society, held in Guwahati, India, in 2012 
and 2014. As with previous conferences, these meetings were held at the Don Bosco Institute 
in Guwahati, Assam, and hosted in collaboration with Gauhati University. This volume 
includes sixteen contributions. Exactly half of them are authored by North East Indian 
scholars, while the other half are authored by an international researcher community from 
France, Japan, Russia, Switzerland, and USA. The languages discussed in these contributions 
come from the Tibeto-Burman, Austroasiatic, Indo-Aryan, and Tai-Kadai language families 
and thus represent the full phylogenetic diversity of the North East. 

Our section on phonology begins with three papers that offer overview descriptions of 
phonological systems. Longmailai and Cing provide a description of the Dimasa and Tedim 
Chin phonologies in a comparative perspective, surveying both segmental and suprasegmental 
features. Similarly, Kondakov's contribution discusses the phonology of Harigaya Koch, the 
Koch variety used as a lingua franca among speakers of this underdescribed Bodo-Garo 
language. Kondakov has added a very useful appendix of a large number of carefully 
transcribed Harigaya Koch words to his contribution. Furthermore, Veikho and Khyriem give 
an overview of Poula phonetics and phonology, a language of Nagaland and Manipur that has 
remained almost unknown to the linguistic world. The two remaining papers in our phonology 
section have more specific phonetic goals. Meyase investigates the “fifth tone” in Tenyidie 
(previously known as Angami), while Horo and Sarmah study the acoustic properties of 
vowels in Assam Sora. 

The morphosyntax section contains five papers. First we have a study of the 
differential marking of arguments in the Austroasiatic War language by Daladier. The author 
argues that this type of differential marking highlights core arguments for agentive and 
beneficiary roles and crucially refers to the subjectivity of interlocutors or animate core 
arguments. From Austroasiatic, we move to Indo-Aryan with a contribution by Handique and 
K. Dutta on the morphophonemics of verbs in Kamrupi Assamese, considering both the 
paradigmatic and syntagmatic perspectives. Moving on to Tibeto-Burman morphosyntax, the 
first paper is by Otsuka on the paradigm of person marking in the Asho Chin verb. This 
Southern Chin language exhibits a reduced but interestingly innovated system of person 
marking in comparison with its more conservative Northwestern (formerly ‘Old Kuki’) and 
Northeastern Chin (formerly ‘Northern Chin’) cousins. The next paper by N. Dutta and Wary 
briefly examines a subset of the interesting so-called Za constructions in Boro. The 
constructions discussed by the authors have passive-like properties, but can still be used with 
intransitive verbs and in imperatives. Concluding the morphosyntax section, Dey gives an 
overview of kinship terminology in Hrusso, a language of Arunachal Pradesh whose 
phylogenetic affiliation remains controversial. 

Next is a small special section on numeral systems. Barbora, Acharya and Wangno 
compare the numerals of the Bugun, Deuri and Nocte languages of northern Assam and 
Arunachal Pradesh. Following this we have a second contribution by Daladier, in which she 
discusses the origins of numbers, number words and cardinals in the Austroasiatic Pnar, War, 
Khasi, and Lyngam languages, arguing that they go back to ‘grouping units’ of counting 
particular items for particular purposes. 

The final section of this volume addresses topics in historical linguistics as well as 
philology. DeLancey surveys different postverbal negative markers in Kuki-Chin languages 
and argues that some of them originated as prefixes on auxiliaries or copulas, which in turn 
followed the lexical verb. Next we 
have a paper by Macario discussing the phylogenetic position of Apatani within Tibeto- 
Burman. Staying in Arunachal Pradesh, Lieberherr's sizable and very important contribution 
discusses new data collected from two hitherto unknown dialects of Puroik. In the first part of 
his paper, Lieberherr proposes a reconstruction of Proto-Puroik. In the second part, he compares 
the reconstructed forms to Kuki-Chin with the goal of investigating the affiliation of Puroik with 
Tibeto-Burman, and proposes, on the basis of a considerable number of putative cognates, that 
Puroik is indeed a Tibeto-Burman language. Finally, the last contribution 
to this volume is Gogoi’s methodology paper on studying Ahom manuscripts. 

This book represents the second volume published with Asia-Pacific Open Access. 
North East Indian Linguistics thus continues to be available for free download and easily 
accessible to everyone. We wish to thank everyone who helped bring this book to fruition, 
including the authors and peer reviewers, as well as our publisher. 


Linda Konnerth 
Liwachangning, Chandel, Manipur, India 


Stephen Morey 
Melbourne, Australia 


Priyankoo Sarmah 
Guwahati, India 


Amos Teo 
Paris, France 
1. Some phonological features of Dimasa and Tedim Chin¹ 


Monali Longmailai and Zam Ngaih Cing 
North-Eastern Hill University, Shillong 


Abstract This paper discusses the phonological typology of two related languages from the Tibeto-Burman 
language family, Dimasa and Tedim Chin. Dimasa is a Bodo-Garo language while Tedim Chin belongs 
to the Kuki-Chin group of languages. It introduces the segmental inventories in both Dimasa and Tedim 
Chin. It describes phonotactics, syllable structure and tone in these two languages. It discusses the 
phonological processes present in these languages, such as vowel length, deletion, gemination, 
assimilation, dissimilation, metathesis and glottalisation. The paper compares the two languages and 
brings out the phonological similarities and dissimilarities throughout the findings. 


Citation Longmailai, Monali and Zam Ngaih Cing. 2015. Some phonological features of Dimasa and Tedim Chin. North East 
Indian Linguistics 7, 15-28. Canberra, Australian National University: Asia-Pacific Linguistics Open Access. 


Volume Editors Linda Konnerth, Stephen Morey, Priyankoo Sarmah, Amos Teo 
Copyright © 2015, the author(s), released under a Creative Commons Attribution license 
1. Introduction 


Dimasa and Tedim Chin belong to two sub-branches, Bodo-Garo and Kuki-Chin respectively, of the 
greater Tibeto-Burman language family.² Dimasa is spoken mainly in Assam and in Dimapur 
district in Nagaland. Tedim Chin is spoken in Churachandpur and Chandel districts in 
Manipur, some parts of Champhai district in Mizoram and in the northern part of 
neighbouring Myanmar. According to the 2001 census, the Dimasa population is 65,000 in Dima 
Hasao and 40,000 in Karbi Anglong districts in Assam. The Tedim Chin population is 344,000 in 
India and Myanmar according to Grimes (1996). The data used in the paper are from Hasao, 
the standard dialect of Dimasa, and from Tedim, the standard dialect of Tedim Chin.³ 
Native speakers nowadays prefer ‘Tedim’, as they consider it more native and 
appropriate than ‘Tiddim’, which was in use in the British records (Cing 2012: 169). 
Both languages are tonal with SOV word order. Dimasa is mostly agglutinating while 
Tedim Chin is partly agglutinating and partly isolating. These genetically related languages (cousin 
languages) are distantly cognate and share some phonological features and processes. We 
will determine what changes may have occurred in these two languages as they developed from 
the proto-language, Proto-Tibeto-Burman. 

This is perhaps the first attempt to compare the phonological features of 
Dimasa and Tedim Chin. The paper therefore aims to investigate the phonological typology 
of both languages, which belong to different sub-groups of the Tibeto-Burman language family. 
It also seeks to establish the degree of similarity and dissimilarity in their phonological 
features. In this paper, we introduce their segmental inventories in §2. In §3, we highlight the 
tonal features and tone sandhi of the two languages. Finally, in §4, we discuss some of the 
phonological processes found in these languages. Figure 1 illustrates the areal 
distribution of the Dimasa and Tedim Chin languages. 


¹ We would like to acknowledge the reviewers of the paper for their comments and suggestions. 

² See Lewis, Simons and Fennig (2015). The ISO 639-3 codes given in Ethnologue are dis for Dimasa and 
ctd for Tedim Chin. The alternate names listed there are Dimasa Kachari and Hills Kachari for Dimasa, 
and Tedim and Tiddim for Tedim Chin. 

³ The authors, native speakers of Dimasa and Tedim Chin, have provided the elicited data for the phonological 
analysis in this paper. 
[Figure 1 map: North Eastern India, with the Dimasa-speaking region and the Tedim Chin-speaking region marked. Map source: National Atlas & Thematic Mapping Organisation.]


Figure 1: Areal distribution of Dimasa in Assam and 
Tedim Chin in Manipur and adjoining areas 


Figure 2 shows their genetic classifications (modified from Lewis, Simons and Fennig 2015). 


[Figure 2 tree diagram. Branch labels: Bodo-Garo-Northern Naga, Dhimalish, Jingpho-Luish, Kuki-Chin-Naga. Language labels that survive: Riang, Tippera, Tiwa, Usoi; Aimol, Anal, Biete, Chin (Paite, Siyin, Tedim, Thado), Chiru, Gangte, (Chothe, Kharam, Moyon), Purum, Ralte, Simte, Vaiphei, Zo.]


Figure 2: Genetic Classification of Dimasa and Tedim Chin 
based on the SIL Ethnologue (2015) 


2. Segmental inventories 

In this section, we identify the basic vowel and consonant inventories of 
Dimasa and Tedim Chin in §2.1 and §2.2. We then examine the formation of consonant 
clusters in §2.3 and syllable structure in §2.4. 

2.1. Vowels 


Dimasa has a simple vowel system consisting of 6 short vowels as presented in Table 1. 


Table 1: Vowels in Dimasa 


         Front (unrounded)   Central   Back (rounded)
High     i                             u                Close
Mid      e                   ə         o                Mid
Low                          a                          Open


A minimal set of vowel phonemes in Dimasa is shown here in Table 2. 


Table 2: Minimal set of Dimasa vowels 


buma ‘mother’ | K^ro? ‘head’ ga da ‘step first’ 
bumu ‘name’ Vn ‘be fast’ gəda ‘ancient’ 
kre ‘slowly’ 


The sixth vowel /ə/ is a basic vowel phoneme and not an allophone of any other 
vowel. It is the least common vowel in Dimasa, while /a/ is the most common. Except for /ə/, 
which occurs only word medially, the vowels occur in all three word positions. 

Tedim Chin also has a simple vowel system. It does not have the sixth vowel /ə/, which is 
present in Dimasa. Table 3 illustrates the vowels of Tedim Chin. 


Table 3: Vowels in Tedim Chin 


         Front (unrounded)   Central   Back (rounded)
High     i                             u                Close
Mid      e                             o                Mid
Low                          a                          Open


A minimal set of vowel phonemes in Tedim Chin is illustrated in Table 4. 


Table 4: Minimal set of Tedim Chin vowels 


tay ‘straight’    sam ‘hair’
tey ‘select’      sim ‘count’
tan ‘cubit’       sum ‘money’
tuj ‘above’


All the vowels in Tedim Chin occur in all three positions in a word, although /o/ occurs very 
rarely in word-final position. 
Dimasa has 6 diphthongs, with the main vowel followed by the glides /i/ and /u/. /ai/ and 
/au/ are the most common diphthongs.⁴ The language does not have triphthongs. Table 5 
shows the possible diphthongs in Dimasa. 


Table 5: Dimasa Diphthongs 


VG Dimasa Gloss 
ai pai ‘come’ 
ei olei ‘similarly’ 
oi foi ‘tolerate’ 
ou houra ‘over there’ 
au dau? ‘stab’ 
ui hui ‘hello’ 


Tedim Chin has 10 diphthongs (ai, au, ei, eu, ia, iu, oi, ou, ua, ui) and 4 triphthongs (iai, 
iau, uai, uau). The vowels /i/ and /u/ are phonetically realised as glides in the diphthongs and 
triphthongs. Table 6 presents the Tedim Chin diphthongs and triphthongs. 


Table 6: Tedim Chin Diphthongs and Triphthongs 


VG                     GV                    GVG
ai   pai ‘go’          ua   xua ‘village’    iai   hiaihiai ‘decrease’
au   vau ‘threaten’    ui   lui ‘river’      iau   liau ‘pay fine’
ei   lei ‘tongue’      ia   kia ‘fall’       uai   xuai ‘bee’
eu   keu ‘dry’                               uau   huau ‘whisper’
oi   toi ‘narrow’
ou   mou ‘bride’
iu   kiu ‘elbow’


Nasal vowels are not phonemic in Dimasa or Tedim Chin, although vowels can be 
nasalized in Dimasa through regressive assimilation, as in /ridʒampʰain/ > [ridʒampʰãi] ‘Dimasa 
traditional dress for women’. 


2.2. Consonants 


Dimasa has 17 phonemic consonants and Tedim Chin has 19.⁵ Tedim Chin consonants 
have no allophones, whereas Dimasa consonants do. Table 7 presents a phonetic 
chart of Dimasa consonants with the allophones shown in brackets, and Table 8 presents 
the Tedim Chin phonemic consonants. 


⁴ Jacquesson (2008) mentions only two diphthongs in his data, /ai/ and /au/, while Singha (2001) does not list any 
diphthong. Sarmah (2009) has listed eight diphthongs in his work. 

⁵ Singha (2001) and Jacquesson (2008) mention that Dimasa has 16 consonants. However, in co-author 
Longmailai's data (Hasao dialect), Dimasa has 17 consonants including /ʔ/, which is not mentioned in the 
previous works. 
Table 7: Phonetic Consonants in Dimasa 


             Labial   Labio-dental   Dental      Alveolar   Post-alveolar   Palatal   Velar   Glottal
Stop         pʰ b                                tʰ d                                 kʰ g    ʔ
Nasal        m                                   n                                    ŋ
Fricative             [f]                        [s] [z]    ʃ                                 h
Affricate                            [ts] [dz]              [tʃʰ] dʒ
Tap                                              r
Semi-vowel   w                                                              j
Lateral                                          l


Among the stops in Dimasa, /pʰ/ and /kʰ/ occur in all three word positions. /tʰ/ and 
/d/ occur initially and medially in a word. The voiced unaspirated stops /b/ and /g/ occur 
word initially and medially, and are devoiced in word-final position. The 
voiceless glottal stop /ʔ/ occurs only word finally in Dimasa. 


Table 8: Phonemic Consonants in Tedim Chin 


             Labial    Labio-dental   Alveolar   Palatal   Velar   Glottal
Stop         p pʰ b                   t tʰ d     c         k g     ʔ
Nasal        m                        n                    ŋ
Fricative              v              s z                  x       h
Lateral                               l


Not all stops in Tedim Chin occur in all three environments (initial, medial and final), 
which is also the case in Dimasa. Tedim Chin has no voiced aspirated stops in any of the three 
environments. The glottal stop /ʔ/ occurs medially and finally in Tedim Chin. 

Minimal sets for stops in Dimasa and Tedim Chin are shown in Table 9. 


Table 9: Minimal sets (stops) 


Dimasa                   Tedim Chin
pʰa   ‘attach’           pat   ‘cotton’
ba    ‘carry (back)’     pʰat  ‘praise’
                         bat   ‘owe’
tʰa   ‘let’s go’         tat   ‘thread that is cut off’
tʰaʔ  ‘tuber’            tʰat  ‘kill’
                         tek   ‘old’
                         teʔ   ‘measure’
da    ‘don’t’            dat   ‘residue left after drinking tea in a tea cup’
ga    ‘step (v)’         gat   ‘weaved’
kʰa   ‘tie (v)’
kʰaʔ  ‘heart’


Dimasa nasal consonants /m/ and /n/ occur word initially, medially and finally, although 
/ŋ/ does not occur word initially. In contrast to Dimasa, Tedim Chin has /ŋ/, besides /m/ and 
/n/, occurring in all three positions. Minimal sets for nasals are shown in Table 10. 
Table 10: Minimal sets (nasals) 


Dimasa                       Tedim Chin
mai    ‘rice’                mai   ‘face’
nai    ‘see’                 nai   ‘silk’
kʰraŋ  ‘raise a child’       ŋai   ‘love’
kʰram  ‘drum’


Fricatives and affricates in Dimasa do not occur word-finally. [s] and [tʃʰ] are allophones 
of /ʃ/. [s] occurs in consonant clusters with /r/. [tʃʰ] is an allophonic 
variant in a few words, such as ʃa ‘tea’ and the words for ‘son’, ‘child’ and ‘Kachari’. [z] is an 
allophonic variant of /dʒ/ only in a consonant cluster or when the sixth vowel /ə/ stands 
between the two consonants. It does not occur word initially or word finally. The 
dental affricates [ts] and [dz] are allophones of /tʰ/ and /d/ only when followed by /i/ in 
Dimasa. [f] is a free allophonic variant of /pʰ/. Minimal sets for the phonetic fricatives, 
affricates and their allophones in Dimasa, and phonemic fricatives and stops in Tedim Chin, 
are shown in Table 11. 


Table 11: Minimal sets (stops, fricatives and affricates) 


Dimasa Tedim Chin 
gfu ‘white’ sun ‘prick’ 
gfu ‘impure’ zun ‘urine’ 
Jam ‘day’ van ‘sky’ 
dain ‘far’ han ‘grave’ 


ham ‘closeness with relatives’ | xan ‘storey’ 


p’anfla ‘traditional gate’ cm ‘nail’ 
fanfla ‘traditional gate’ zin ‘travel’ 


Jra ‘leftover’ 
sra ‘leftover’ 


bday ‘younger sibling’ 
bzay ‘younger sibling’ 


fa ‘tea’ 
du ‘tea’ 


t^ ‘blood’ 
fi ‘blood’ 


dr ‘water’ 
dr ‘water’ 


Tedim Chin shows no free variation, and its fricatives and affricates do not occur word finally. 
The voiceless velar fricative /x/ is present in Tedim Chin and occurs in the initial and 
medial positions. /c/ is followed by /i/ and /ia/ in Tedim Chin. 
Among approximants, /r/ and /l/ are found in Dimasa while only /l/ is present in Tedim 
Chin. In Dimasa, /l/ does not occur word finally while /r/ occurs in all three positions. 
Unlike Dimasa /l/, Tedim Chin /l/ occurs in all three positions. 

Dimasa has the semi-vowels /w/ and /j/, which occur only word initially and medially, not 
word finally. Tedim Chin seems to have these semi-vowels, but it is still not clear whether to 
identify them as distinct phonemes. Table 12 shows the minimal pairs for the central and 
lateral approximants. 


Table 12: Minimal pairs (approximants) 


Dimasa                      Tedim Chin
ra   ‘cut (vegetables)’     lim ‘delicious’
la   ‘take’                 lum ‘sleep’
waʔ  ‘bamboo’
jaʔ  ‘sign of pain’


2.3. Phonotactics 


Dimasa consonant clusters occur word initially but not word finally. Tedim Chin consonant 
clusters, by contrast, do not occur word initially; they occur medially and, less often, 
finally. 

Dimasa is highly productive in forming consonant clusters, while Tedim Chin is 
less so. In Dimasa, stops, nasals, sibilants and liquids form consonant 
clusters, as shown in Table 13. 


Table 13: Consonant Clusters in Dimasa 


Stop + Stop          bda     ‘elder brother’
Stop + Nasal         kʰma    ‘lose something’
Nasal + Stop         mtʰau   ‘stop’
Nasal + Nasal        mnaŋ    ‘earlier’
Stop + Liquid        pʰra    ‘quick’
Liquid + Stop        rda     ‘vein’
Liquid + Sibilant    rza     ‘be heavy’
Liquid + Nasal       rmai    ‘embroidery on Dimasa women's traditional dress’
Nasal + Liquid       mram    ‘bad smell’
Sibilant + Liquid    ʃlai    ‘tongue’
Nasal + Sibilant     mʃo     ‘wake somebody’
Sibilant + Nasal     ʃmon    ‘move somebody’
Stop + Sibilant      bʃa     ‘son’
Sibilant + Stop      ʃgau    ‘remove’
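
The cluster types listed in Table 13 can be derived mechanically from the natural class of each consonant. The following is a minimal sketch, assuming an illustrative IPA normalisation of the Dimasa consonants described in §2.2; the dictionary `NATURAL_CLASS` and the function `cluster_type` are hypothetical names introduced here and are not part of the authors' analysis.

```python
# Hypothetical lookup table: consonant symbol -> natural class.
# The symbols are an assumed normalisation of the inventory in §2.2.
NATURAL_CLASS = {
    "pʰ": "Stop", "b": "Stop", "tʰ": "Stop", "d": "Stop", "kʰ": "Stop", "g": "Stop",
    "m": "Nasal", "n": "Nasal",
    "ʃ": "Sibilant", "s": "Sibilant", "z": "Sibilant",
    "r": "Liquid", "l": "Liquid",
}


def cluster_type(first: str, second: str) -> str:
    """Label a two-consonant cluster by the classes of its members."""
    return f"{NATURAL_CLASS[first]} + {NATURAL_CLASS[second]}"


print(cluster_type("b", "d"))  # Stop + Stop
print(cluster_type("m", "r"))  # Nasal + Liquid
```

A lookup of this kind only labels clusters; it says nothing about which combinations are actually attested, which remains an empirical question answered by the table.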


The sixth vowel /ə/ can be dropped in fast speech when it occurs between two consonants 
in some words, as in bəda > bda ‘elder brother’, bəʃa > bʃa ‘son’ and ʃəgau > ʃgau 
‘remove’. Triple consonant clusters occur very rarely in Dimasa. 
Burling (2013) mentions two words with triple consonant clusters that result from vowel 
deletion between the stop and the sibilant: pʰʃkau ‘remove the outer cover of a betel 
nut’ and gʃgau ‘scratch oneself’. In addition, consonant clusters in this language are 
becoming more productive due to vowel deletion and vowel shortening. 

In Tedim Chin, consonant clusters are formed from stops, nasals and liquids 
word-medially, as shown in Table 14. 


Table 14: Consonant Clusters in Tedim Chin 


Stop + Stop          naʔtaŋ     ‘banana’
Stop + Nasal         zaʔneu     ‘cat’
Nasal + Stop         sangam     ‘sibling’
Liquid + Stop        pʰalbi     ‘winter’
Stop + Sibilant      miksi      ‘ant’
Nasal + Sibilant     samsiʔ     ‘comb (n)’
Liquid + Sibilant    gilzaŋ     ‘intestine’
Nasal + Nasal        biaŋmul    ‘beard’
Liquid + Liquid      guallel    ‘defeated’

In both Dimasa and Tedim Chin, cluster formation with stops and nasals is highly 
productive. Triple consonant clusters are not attested in Tedim Chin, and occur only 
very rarely in Dimasa. Cluster formation in Tedim Chin is not possible in word-initial position. In 
word-final position, a consonant cluster occurs as the combination of the lateral /l/ and the 
glottal stop /ʔ/, as in gelʔ ‘write’. 


2.4. Syllable Structure 

Dimasa words range from monosyllabic to quadrisyllabic, but most are 
monosyllabic or disyllabic. Table 15 shows the syllable structure of Dimasa words of 
one, two, three and four syllables. 


Table 15: Dimasa Syllable Structure 


Syllable Structure    Dimasa         Gloss
CV                    di             ‘water’
VC.VV                 amai           ‘mother’
CVC.CV.CV             bandola        ‘widower’
CVV.CV.CVC.CV         maimutʰarba    ‘funeral’


Tedim Chin has a syllable structure similar to that of Dimasa, with mostly monosyllabic 
and disyllabic words. Table 16 shows the possible syllable structures in Tedim Chin. 


Table 16: Tedim Chin Syllable Structure 


Syllable Structure    Tedim Chin     Gloss
CV                    nu             ‘mother’
VV.CV                 aisa           ‘crab’
CVV.CV.CVC            meimatum       ‘boil (n)’
VC.CV.CVC.CVV         aksimelkai     ‘comet’
In both Dimasa and Tedim Chin, polysyllabic words are derived forms, mainly 
numerals and phrasal words. Both languages have open and closed syllables, as seen in 
Tables 15 and 16. 

In monosyllables, consonant clusters are found initially in Dimasa but not 
in Tedim Chin. The basic syllable pattern is (C)V(C), where V is the 
obligatory element. 
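
The (C)V(C) template can be checked against syllable skeletons such as those in Table 16. The sketch below is illustrative only: it assumes that skeletons are written with the letters C and V, that syllables are separated by dots, and that a VV sequence (a diphthong) counts as a single nucleus. The names `BASIC_SYLLABLE` and `fits_basic_pattern` are hypothetical.

```python
import re

# Assumed encoding: optional onset C, nucleus V or VV, optional coda C.
BASIC_SYLLABLE = re.compile(r"C?V{1,2}C?")


def fits_basic_pattern(skeleton: str) -> bool:
    """Return True if every dot-separated syllable matches (C)V(C)."""
    return all(BASIC_SYLLABLE.fullmatch(s) is not None for s in skeleton.split("."))


for skeleton in ["CV", "VV.CV", "CVV.CV.CVC", "VC.CV.CVC.CVV", "CCV"]:
    print(skeleton, fits_basic_pattern(skeleton))
```

Under these assumptions the four skeletons from Table 16 fit the template, while a skeleton with an initial cluster such as CCV does not, which is consistent with initial clusters being confined to Dimasa.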


3. Tones 

Dimasa and Tedim Chin are tonal languages. Dimasa has register tones that are only lexical, 
while Tedim Chin has register tones that are both lexical and grammatical. §3.1 
discusses lexical tone in both languages and §3.2 grammatical tone in 
Tedim Chin. Finally, §3.3 shows tone sandhi in the two languages. 


3.1. Lexical Tone 


Dimasa has a simple tone system. It contrasts three register tones: high (´), mid (¯) and low (`). 
A minimal triplet for Dimasa tone is shown in (1). 


(1) kʰáʔ ‘heart’ 
kʰā ‘bitter’ 
kʰà ‘tie’ 


In Dimasa, low tones can be long, as in k’à ‘tie’, or short and glottal, as in nòʔ ‘house’.
Mid tones are neither long nor short but they tend to have creakiness, as in gfu ‘white’. High
tones are mostly glottal, as in mjáʔ ‘boy’. Low tones are longer than high tones, while high
tones are shorter and more glottal than low tones. A few low tones are short and glottal.⁶

Tedim Chin has a moderately complex tone system. It has three contrastive tones, namely
high (´), mid (¯) and low (`). A segmental triplet of Tedim Chin tonemes is illustrated in (2).⁷


(2) xá ‘bitter’
xā ‘soul’
xà ‘moon’


In Tedim Chin, a low short tone occurs wherever a word ends with the glottal stop. This feature
is predictable (Thang 2001), as in sàʔ ‘thick’, xùʔ ‘cough’ and tèʔ ‘measure’, and it is
similar to the low short tone with the glottal ending in Dimasa. There are also a few words in
Tedim Chin with an exceptional high tone, such as xék ‘dwarf’ and ak ‘new’. This high tone
is short but not glottal, as opposed to the Dimasa short high tone.


3.2. Grammatical Tone 


Grammatical tone is attested in Tedim Chin and in some other Kuki-Chin languages such as
Thadou (Hyman 2007).⁸ In Tedim Chin, the grammatical tone marks possession,
as illustrated in (3)-(5).


6 Jacquesson (2008) posits two tones in Dimasa: high and low. Sarmah (2009) has three tones: high, level and
low. However, he has not classified the high tone as long high or high glottal, or the low tone as low long or low
short. Burling (2009) has four tones based on the glottal stop: (a) low, not stopped, (b) high, not stopped,
(c) low, stopped and (d) high, stopped.

7 Accent marking is used to represent the register tones in Tedim Chin. In (2), xá ‘bitter’ has a high tone,
xā ‘soul’ has a mid tone and xà ‘moon’ has a low tone.

(3) cignoóu ‘Cingno (a proper name)’


(4) *emnou làibu
Cingno book
‘Cingno's book’


(5) cmnóú làibu
Cingno book
‘Cingno's book’


When the proper name ‘Cingno’ is marked for possession, the tone of its second syllable, seen
in (3), rises abruptly to a high tone, as shown in (5). This shows that tone has a grammatical
function in Tedim Chin, namely to mark possession.


3.3. Tone Sandhi 


A tone in Dimasa is realised in two contexts: 1) in isolation and 2) sentence-finally. But when
a word containing a high or low tone is suffixed within a sentence, it loses its tone and
mostly becomes mid. The word Join d ‘say, tell’, for example, has the high short tone on
the second syllable in isolation. But it becomes mid tone in (6) as (ëmt. when it is followed
by the past tense suffix -kʰa. However, its tone shifts to the following suffix -hå? as in /aint't
ha? ‘say it over’ as seen in (6).


(6) wainfodaü gràü lait fam-th-k'a
Wainshodau word CLS-one ask-say-PFV
‘Wainshodau said over something’.


In Tedim Chin, tone sandhi occurs in compounding in two ways. Firstly, the mid tone of the
root word nī ‘sun’ shifts to high when it is compounded with another root word, sā ‘hot’,
which has the same mid tone, as illustrated in (7). Secondly, in (8), the mid tone in xōl
‘stop’ becomes a low tone when it is compounded with the root word mùn ‘place’, which has
the low tone.


(7) nī ‘sun’ + sā ‘hot’ → nísá ‘sunshine’
(8) xōl ‘stop’ + mùn ‘place’ → xòlmùn ‘resting place’


Morphological processes such as suffixation (in Dimasa) and compounding (in Tedim
Chin) cause tonal shift or ‘tone sandhi’.


4. Phonological processes 
Dimasa and Tedim Chin share some phonological processes such as vowel length, deletion,
gemination and glottalisation. While Dimasa has insertion, metathesis and
assimilation, Tedim Chin does not share any of these processes.


8 The presence of the grammatical tone is still not explored in many of the Kuki-Chin languages. In Bodo-Garo 
languages, there is no grammatical tone. 

4.1. Vowel length


Dimasa does not have long vowels, while it is presently difficult to distinguish vowel
quantity in Tedim Chin. In Dimasa, vowel length occurs only with the low tone that is not
short and glottal, owing to the absence of a glottal stop. The monosyllable mja ‘yesterday’ is
low and long, in contrast to the short glottal mjaʔ ‘bamboo shoot’. In Tedim Chin, vowel
length occurs with the mid tone, as in siːŋ ‘ginger’, which is slightly falling, in contrast to
siŋ ‘shake’, which is short and mid. siːŋ is perceived as phonologically longer than siŋ due
to the tonal behaviour.


4.2. Deletion 


Vowel deletion occurs word-initially and medially in Dimasa, though it is more frequent
medially than initially or finally.⁹ Some examples are illustrated in (9) and (10), with the
deletion of vowels occurring in initial and medial position.


(9) amauʃa > mauʃa ‘uncle’
(10) matla > mala > mla ‘young girl’


In (9) the vowel /a/ is deleted from the initial position. mauʃa is more widely used than
amauʃa today. In (10), /a/ changes to /o/ and finally the vowel is lost, forming the consonant
cluster in mla ‘young girl’ in fast speech.

In Tedim Chin, deletion of vowels and consonants occurs only word medially when the 
root word is suffixed to form a verb or a noun phrase as illustrated in (11) and (12). 


(11) ke + m → ken ‘don’t be’


(12) amaʔ + m → aman ‘by him/her’
3SG ERG


The vowel Ju in ker and m are lost in (11) as the Tedim Chin language does not 
diphthongise a vowel. This is a case of monophthongisation. Similarly in (12), the glottal stop
/ʔ/ is lost as it never occurs before a vowel and the vowel Ju as such, that is, vowels are
always monophthongised in the language. This kind of deletion happens only with the -rn 
suffix. 


4.3. Gemination 


Gemination in Dimasa occurs in derived words. In (13), the suffix -ma nominalises the bound
adjective ham-, which yields a geminate at the morpheme boundary in the resulting noun
‘goodness, being good’.


(13)  ham-ma 
good-NMZ 
‘goodness’ 


9 Consonant deletion in word-final position is found in the Hawar and other dialects of Dimasa. The voiceless
velar stop /k^/ is dropped word-finally only when it follows the vowel /i/, as in b/ik ‘daughter’, which becomes
b/u; the vowel also changes to /u/.

In (14), the bilabial /p/ in hap ‘enter’ is voiced to /b/ when the nominalizer -ba is suffixed
to it, which is a case of regressive assimilation in addition to voicing and gemination.


(14)  hab-ba 
enter-NMZ 
‘entering’ 


Gemination is found at the syllable boundary in compound words in Tedim Chin, as shown in
(15) and (16).


(15) xut-tum
hand-grip
‘fist’

(16) liam-ma
injure-area
‘wound’


In (15) xut means ‘hand’ and -tum is ‘grip’. In (16) liam means ‘injure’ and -ma is ‘the
injured area’.

In both Dimasa and Tedim Chin gemination occurs only word-medially, that is, across the
coda of the first syllable and the onset of the second syllable. Dimasa forms geminates
through nominalisation, while Tedim Chin forms them through compounding.


4.4. Assimilation 


Dimasa is highly productive in consonant assimilation, but not in vowel assimilation. Most
assimilation happens when a nasal is followed by a stop, as illustrated in (17).


(17) ʃain + bili → ʃaimbli > ʃaimli
sun time ‘evening’


In the above example, /n/ becomes /m/ when followed by /b/, assimilating to the place of
articulation of /b/, which is bilabial. This is a case of regressive assimilation, stated as
the rule in (18).

(18) n → m / __ b
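
The rule in (18) can be sketched as a simple string rewrite. This is only an illustration, assuming a broad transcription in which each segment is one character; the function name `assimilate_nasal` and the spelling of the input form are assumptions made for the example, not part of the analysis.

```python
def assimilate_nasal(form: str) -> str:
    """Apply rule (18): n -> m immediately before b (regressive place assimilation)."""
    # Only the n + b sequence named in the rule is covered; other nasal + stop
    # sequences are left untouched because the text states no rule for them.
    return form.replace("nb", "mb")


# Concatenating the two roots of (17) creates the n + b context.
compound = "ʃain" + "bili"
print(assimilate_nasal(compound))
```

The sketch stops at the assimilated stage; the later vowel loss and cluster reduction seen in (17) are not modelled.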


4.5. Dissimilation


/ŋ/ changes to /h/, with the loss of the nasal /ŋ/ in the second syllable, as shown in (19).

(19) maŋŋilʔ > maŋhilʔ ‘forget’

A rule has been framed here to state this.

(20) ŋ → h / ŋ __


Both maŋŋilʔ and maŋhilʔ are used interchangeably by speakers of Tedim Chin.

4.6. Metathesis


Some words in Dimasa can undergo metathesis, as shown in (21) and (22).


(21) hukʰir > hukʰri ‘hungry’
(22) tʰurʃa > tʰruʃa ‘Muslim’


/r/ and /i/ in (21) and /r/ and /u/ in (22) have been transposed within the same syllable.
hukʰir and hukʰri in (21) and tʰurʃa and tʰruʃa in (22) are used interchangeably by Dimasa
speakers.


4.7. Glottalisation 


The glottal stop occurs in word-final position in Dimasa, although glottalisation can
occur in both word-medial and word-final position in monosyllables, as shown in (23) and (24).


(23) no?» no? ‘house’ 
(24) hap > harp ‘enter’ 


Glottalisation is not natural in the word medial position in Dimasa. In exceptional cases, it 
occurs in this position in fast speech as shown in (25) and (26). 


(25) hadifa > ha?dfa ‘Bengali’ 
(26) habfau > ha?bfau ‘universe’ 


In Tedim Chin, glottalisation in monosyllables is seen in the verb stem when Stem 1
alternates with Stem 2. It occurs only in word-final position, as in lau and lauʔ ‘afraid’ and
also yar and yarʔ ‘love’. In these two examples, lau and yar are the Stem 1 verbs whereas lauʔ
and yarʔ are the Stem 2 verbs, which alternate according to the syntactic construction
(Henderson 1965: 72).¹⁰


5. Conclusion 


Dimasa and Tedim Chin, being languages of the same Tibeto-Burman family, have many
phonetic and phonological features in common. While Dimasa is equally monosyllabic and
disyllabic, Tedim Chin is mostly monosyllabic. Both are tonal languages, and vowel length is
phonologically present in both, as is tone sandhi. Vowel deletion occurs in both
languages. Interestingly, this deletion results in monophthongisation in Tedim Chin.
Dimasa has mostly regressive assimilation while Tedim Chin has dissimilation. Metathesis is
found only in Dimasa. Nominalisation leads to gemination in Dimasa, while compounding
does so in Tedim Chin. Lastly, glottalisation of monosyllables and disyllables has been found
in word-medial and word-final positions, but not word-initially.

However, the high allophonic variation in Dimasa raises the question of whether some of the
sounds are really identifiable phonemes. In Tedim Chin, it is not clear whether the glottal
stop /ʔ/ is an allophonic variant of /h/: /h/ occurs word-initially and medially while /ʔ/
occurs word-medially and finally. The number of tones identified for Dimasa by different
linguists is not consistent, with competing claims of two-tone, three-tone and four-tone
systems. For Tedim Chin, it remains an open issue whether the language has phonetically long
vowels. A deeper study of the function of grammatical tone in Tedim Chin is also necessary.
These are some of the problems in Dimasa and Tedim Chin that need to be addressed in further
research.


10 Henderson (1965: 72) referred to the two alternating verbs as Form I and Form II.


Abbreviations 


3SG  Third person singular
CLS  Classifier
ERG  Ergative
IMP  Imperative
NEG  Negation
PFV  Perfective

1. Introduction


The Koch language is spoken by a group of people of the same name. It belongs to the Bodo- 
Garo, or Bodo-Koch group of the Tibeto-Burman language family (Benedict 1972; Burling 
2003a: 175-178; Joseph and Burling 2006: 1-4). Koch has been considerably influenced by 
prolonged contact with Indic languages, which is evident in its phonology and grammar. 

The Koch people are mainly found in the West Garo Hills district of Meghalaya and the 
Goalpara and Dhubri districts of Assam. There are pockets of the Koch population in the 
northern part of the East Garo Hills district of Meghalaya, as well as in the Nagaon district of 
Assam, the Jalpaiguri district of West Bengal and in Tripura. After the separation of East 
Pakistan and its later transformation into Bangladesh, many Koch fled to the neighbouring 
Indian states of Meghalaya and Assam. 

The Koch people divide themselves into nine ethnolinguistic groups. Each group is 
thought to speak a different dialect. Nowadays, however, the speakers of only six groups, 
namely Tintekiya, Wanang, Harigaya, Margan, Kocha (or Koch-Rabha) and some Chapra 
continue to speak their original mother tongue. The other three groups, Satpari, Sankar and 
Banai speak either Hajong (Indo-Aryan) or a mixed language which contains features of 
Bengali and Assamese and is sometimes referred to as Jharua, a derogatory term meaning ‘of
the jungle’² (Kondakov 2013: 8). Hajong itself is sometimes called Jharua (Majumdar 1984:
151; Hajong 2002: 17). Nevertheless, many Koch know the Harigaya variety since it is 
commonly used at Koch social gatherings and is reportedly easy to learn. 

Not much linguistic work has been done on Koch. There is only scarce information in the 
‘Linguistic Survey of India’ (Grierson 1967). From the historic side, there is a short account 


1 I would like to thank Dr. Mary Ruth Wise for her valuable consultation at the initial stages of writing this
paper, and Dr. Paul Arsenault for his further consulting and review.

2 Bengali and Assamese are closely related languages, and in the area under survey they merge into each other in
a continuum resulting in transitional forms such as Jharua. For this reason, a combined term Bengali/Assamese
will sometimes be used when referring to the Indo-Aryan form of speech used by the Koch.
of the Koch Kingdom in the ‘History of Assam’ (Gait 1933). A few individuals from India
have done their PhD studies on Koch, but mostly in the field of anthropology³. Therefore, the
Koch language has remained largely unstudied. At the same time, our knowledge of other
languages of the Bodo-Garo group has been enriched by scholars such as R. Burling, U.V.
Joseph, S. van Breugel, and others, to whose works we will refer in this paper.

The present paper aims at describing the phonological system of Koch with primary focus
on the Harigaya variety. The data consist of about 1,000 words collected during fieldwork in
2008 and 2009 from native speakers and later cross-checked. The following software was
used to analyse the data: Speech Analyser 3.0.1 and Phonology Assistant 3.0.1. This paper
presents descriptions of phonemes, syllable structure, phonological processes and prosodic
features, together with a list of about 1,000 Koch words in phonemic writing.


2. Phonemic inventory 
2.1. Consonant phonemes 


Koch has 25 consonant phonemes, including four voiced aspirated ones and the glottal stop
(shown in parentheses).


Table 1: Consonants


p  | b    | t  | d    | k  | g    | (ʔ)
pʰ | (bʰ) | tʰ | (dʰ) | kʰ | (gʰ)
tʃ | dʒ
(dʒʰ)
s  | h
m  | n    | ŋ
r
w  | j
l


The Koch stops are generally characterized by contrasts between voiced vs. voiceless and
unaspirated vs. aspirated. Voiced aspirated stops were not originally characteristic of Koch.
However, due to Indo-Aryan influence, some of the Koch dialects have adopted the whole
row of voiced aspirated stop phonemes: /bʰ/, /dʰ/, /gʰ/ and /dʒʰ/. These initially
entered the Koch sound system through loanwords, then gradually acquired the status of
native phonemes and began to spread to native words and even to some loanwords where
there was previously no aspiration at all. Thus a phenomenon of hypercorrection took place.
A very similar phenomenon has occurred in Rabha (Joseph 2007: 19). The process of
aspiration and deaspiration is not uniform across different Koch speech varieties, so one can
find words with voiced aspirated stops in one variety and corresponding words with no
aspiration in another variety (see more in §6).

The glottal stop has the status of a marginal phoneme in the present analysis. It is not a
frequently occurring sound in Harigaya Koch. It usually serves as a barrier between two
echoing vowels as in [beʔe] ‘where?’ or separates identical vowels belonging to different
morphemes as in [na-ʔa] ‘(he) hears-PR’⁵. There are very few native words where the glottal


3 The most recent PhD dissertation that deals with language aspects of Koch is written in Assamese by A. B.
Mandal, University of Gauhati, India, 2010.

4 The status of the glottal stop will be discussed below.

5 Cf. contiguous occurrence of /a/ being divided by a glottal stop in Rabha (Joseph 2007: 105).
stop occurs in the coda of a non-final syllable as in [maʔ.wa] ‘boy’. The Koch Spelling Guide
recommends that this sound be represented by the apostrophe (’) (K.D. Harigya et al. 2009).
Whether this is a residue of a formerly present phoneme (perhaps tone), influence of Garo or
another phenomenon needs to be ascertained by further inquiry.


2.2. Vowel phonemes 


There are six vowel phonemes in Koch. 


Table 2: Vowels


i         u
e    ə    ɔ
     a


It is difficult to locate /e/ and /ɔ/ precisely in the phonetic chart: they are somewhere
halfway between close-mid and open-mid position, apparently as in Rabha (Joseph 2007: 50-51).

/ɔ/ has an allophonic counterpart [o] when it is affected by the phenomenon of vowel
harmony and/or by stress. [o] is often perceived as a distinct sound by some Koch writers.

/ə/ is the Koch variant of what has been called in Bodo-Garo linguistics a "sixth"
vowel with a somewhat special status (Burling 2009). It has counterparts in Rabha, Garo and
Boro, variously represented as /ɯ/, /ɨ/ or /ə/. It is a mid central unrounded vowel, and local
Koch writers represent it as 4 and & in the Roman and Assamese orthographies respectively.

This phoneme appears to be closer in articulation (towards /i/) in the Kocha, Wanang, and
Chapra varieties and is often conditioned by vowel harmony (see §4.3).

Koch vowels do not show a contrast in length. Normally a stressed vowel is slightly 
longer than an unstressed one. 

All vowels can occur in open as well as closed syllables. Whenever a two-vowel sequence
would arise, the glottal stop, /j/ or /w/ is inserted (see more in §4.2.3).


3. Koch syllable 


Koch, like the related language Garo, lacks contrastive tones, which is a rather uncommon
phenomenon in Tibeto-Burman. However, as in many tone languages, the syllable is an
important phonological unit (Burling 1981: 61; Joseph and Burling 2006: 3) and, for the
purpose of phonological analysis, is often more relevant than the word. Therefore, in
describing the Koch phonological system it is essential to discuss the syllable, its structure
and distribution patterns. It is also appropriate to discuss further whether consonants
occur syllable-initially (in syllable onsets) or syllable-finally (in syllable codas).


3.1. Syllable structure 


There are six syllable types in Koch, as shown in Table 3. Note that the last two syllable types
occur relatively rarely and only at the end of di- or trisyllabic words. Moreover, the CCV type,
which is the more frequent of the two, is a verbal suffix -tra. The CC sequence per se is
somewhat unstable in Koch (see more in §3.3). Thus, the canonical structure of the regular
Koch syllable is (C)V(C).
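
The relation between the six attested syllable types and the canonical (C)V(C) template can be checked mechanically. The following is a minimal sketch, assuming syllable shapes are written as strings of the letters C and V; the names `CANONICAL` and `is_canonical` are introduced here only for illustration.

```python
import re

# (C)V(C): an optional onset consonant, an obligatory vowel, an optional coda consonant.
CANONICAL = re.compile(r"^C?VC?$")

# The six syllable types listed for Koch in Table 3.
SYLLABLE_TYPES = ["V", "CV", "VC", "CVC", "CCV", "CCVC"]


def is_canonical(shape: str) -> bool:
    """Return True if a CV-skeleton fits the canonical (C)V(C) template."""
    return bool(CANONICAL.match(shape))


canonical = [s for s in SYLLABLE_TYPES if is_canonical(s)]
marginal = [s for s in SYLLABLE_TYPES if not is_canonical(s)]
print(canonical, marginal)
```

The two shapes that fall outside the template are exactly the rare CCV and CCVC types that the text restricts to the end of di- or trisyllabic words.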

Table 3: Syllable types


V      | u        | ‘that’ (distal demonstrative)
CV     | na       | ‘fish’
VC     | aŋ       | ‘I’
CVC    | don      | ‘become’
.CCV   | pʰuj.trə | ‘[he] came’
.CCVC  | man.treŋ | ‘pestle’


3.2. Syllable-initial consonants 


Practically all consonants except /ŋ/ and the glottal stop can occur in syllable onsets. /s/ is
frequently pronounced with aspiration as in [sʰɔŋ]⁶ ‘village’. When followed by the close
back rounded /u/ it gets somewhat retracted and raised and sounds more like [ʃ] as in [musrut]
‘wipe’.

Syllable-initial consonant clusters are not found in the majority of the Koch varieties. 
Kocha, however, shows the presence of some types of initial clusters, namely, consisting of a 
stop followed by a liquid such as /kr/ in krəw ‘language’. 


3.3. Syllable-final consonants 


Voiced stops, aspirated stops, affricates and /h/ do not occur in syllable codas. Only the 
following 11 consonant phonemes can be found in that position. 


Table 4: Syllable-final consonants


-p   -t   -k   (-ʔ)
-s
-m   -n   -ŋ
-r
-j   -w
-l


Voiceless stops are unreleased. /// is very rare syllable-finally: it is found predominantly in 
loanwords and only in few native Koch words. Final /s/ is restricted to loanwords just as in 
Garo (Burling 2003b: 389; Joseph and Burling 2006: 20). The glottal stop is rare and never
occurs in syllable codas word-finally as in Garo (Ibid., 21). /ŋ/ affects the preceding vowels
/e/ and /ɔ/ by increasing their closeness.

In many multisyllabic words the final consonant of the first syllable immediately precedes
the initial consonant of the second syllable as in sak.maj ‘fly’. Here the consonants /k/ and
/m/ are adjacent, though they do not form a cluster. A CC cluster may occur in the onset of a
non-initial syllable in some words such as man.treŋ ‘pestle’ or kak.tra ‘(it) bit’. In the second
example /-tra/ stands for a verbal suffix which is perceived by some Koch speakers as [-tăra],
thus making it a CV.CV type, not a CCV⁷. There are no word-final consonant clusters.


6 A similar phenomenon is found in the related languages. The Tiwa initial /s/ is strongly aspirated (Joseph and
Burling 2006: 5). In Rabha the initial /s/ is pronounced with greater friction when followed by a high-toned 
vowel (Joseph 2007: 47). It would be interesting to compare such Rabha words with their Koch cognates. 

7 The Harigaya suffix /-t(ă)ra/ has its cognate /-tana/ in Wanang Koch. 

4. Phonological processes
4.1. Processes due to surrounding segments (assimilation) 
4.1.1. Spirantization 


In Koch spirantization occurs in voiceless aspirated plosives, which become fricatives between
vowels as in: dupʰut > [duɸut] ‘snake’, bakʰaj > [baxaj] ‘throw’. Spirantization is not unique
to Koch: it is also found in the same Rabha phonemes (Joseph 2007: 43-44).

In the speech of some Harigaya speakers a similar process takes place in the affricates
[dʒ] and [tʃ], which turn into the voiced palatalized alveolar fricative [zʲ] and the voiceless
palatalized alveolar fricative [sʲ] respectively after a front vowel or the voiced palatal
approximant [j] as in: bedʒat > [bezʲat] ‘how’, hetʃa > [hesʲa] ‘bad’, pʰujdʒok > [pʰujzʲok]
‘[he] has come’.


4.1.2. Velar stop lowering 


A velar stop gets significantly lowered and can be perceived as a uvular stop due to a preceding
open central unrounded vowel (which may sometimes be accompanied by the glottal stop) as
in: sakle > [saqle] ‘early’, maʔka > [maʔqa] ‘be absent’.


4.1.3. Consonant lengthening 


The following consonants can occur slightly lengthened: /p/, [t], [k], [n], ft]. [i] and [I]. 
This is typically the case when they are found in an intervocalic position within roots or at 
morpheme boundaries as in: (banal ‘skin’, [mat:a] ‘big’. 


4.2. Processes due to syllable structure 
4.2.1. Syncope 


In tri- and polysyllabic words a medial unstressed vowel is unstable and is often lost in fluent
speech. This is often the case with derivatives as in: da-patak > [daptak] ‘CAUS-crack (intr.)’
> ‘crack smth.’, ga-saraj > [gasraj] ‘CAUS-rise’ > ‘wake smb. up’, *bi-bila > [bibla]
‘which-time’ > ‘what time, when’. The stress in these words falls on the final syllable
(for more on stress in Koch see §5.1).
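
As a rough illustration of syncope in trisyllabic forms, the sketch below drops the vowel of the medial syllable. It assumes one character per segment and treats only the vowels in `VOWELS` as syllable nuclei (glides such as j are not counted); the function name `syncope` is hypothetical, and longer words are left untouched because the text does not say which medial vowel is lost there.

```python
VOWELS = set("aeiouəɔ")


def syncope(word: str) -> str:
    """Delete the medial vowel of a trisyllabic word (fluent-speech form)."""
    nuclei = [i for i, ch in enumerate(word) if ch in VOWELS]
    if len(nuclei) != 3:
        # Only the trisyllabic case is sketched here.
        return word
    medial = nuclei[1]
    return word[:medial] + word[medial + 1:]


for underlying in ["dapatak", "gasaraj", "bibila"]:
    print(underlying, ">", syncope(underlying))
```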


4.2.2. Desyllabification 


In words ending in a vowel, the locative suffix [-e] ceases to be the nucleus of a syllable
and becomes the coda of the previous syllable by changing into [-j]. In this way the CVC
pattern is produced. This is a common phenomenon in fast or casual speech as in: *rumba-e >
rumba-j ‘inside-LOC’ > ‘inside’.


4.2.3. Epenthesis 


Epenthesis is a very common process in Koch⁸. It occurs at morpheme boundaries when a
root ends in a vowel and a following suffix begins with a vowel as in: *tʃasi-a > tʃosijo
‘finger-DEF’ > ‘the finger’; *tura-e > turaje ‘Tura-LOC’ > ‘in Tura’; *sedʒa-a > sedʒawa
‘porcupine-DEF’ > ‘the porcupine’; *pantʃu-ɔ > pantʃuwɔ ‘thorn-ACC’ > ‘the thorn’.


8 This is also true for Rabha (see Joseph 2007: 101-102).

From the above examples one can see that the insertion of an epenthetic glide between
two successive vowels takes place under the following condition: /j/ is inserted when the
preceding sound is a front or a mid-central vowel, and /w/ when the preceding sound is a back
or an open-central vowel. In the pronunciation of some speakers, however, the epenthetic /j/
and /w/ can freely interchange; in other cases it is solely the glottal stop that plays the role of
an epenthetic element.

The glottal stop is invariably inserted between a verb root ending in the open central
vowel and the present tense suffix /a/ as in: *sa-a > saʔa ‘eat-PR’ > ‘(he) eats’. In other cases
the process is as usual, with the epenthetic element depending on the quality of the preceding
vowel, e.g.: *li-a > lija ‘go-PR’ > ‘(he) goes’; *dʒu-a > dʒuwə ‘lie down-PR’ > ‘(he) lies
down’; *sasa-a > sasawa ‘child-DEF’ > ‘the child’.
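
The choice of epenthetic element described in this section can be summarised in a short sketch. It is only an illustration under stated assumptions: forms are given one character per segment, the vowel classes below are a guess at how the six Koch vowels (plus the allophone o) divide, and the function name `attach_suffix` and its `present_tense` flag are hypothetical. Vowel harmony and the free variation between /j/ and /w/ reported for some speakers are not modelled.

```python
# Assumed vowel classes: front and mid-central vowels select j,
# back and open-central vowels select w.
J_TRIGGERS = set("ieə")
W_TRIGGERS = set("uɔoa")
VOWELS = J_TRIGGERS | W_TRIGGERS


def attach_suffix(root: str, suffix: str, present_tense: bool = False) -> str:
    """Join root and suffix, inserting an epenthetic element between two vowels."""
    last, first = root[-1], suffix[0]
    if last not in VOWELS or first not in VOWELS:
        return root + suffix  # no vowel sequence arises, nothing to repair
    if present_tense and last == "a" and suffix == "a":
        return root + "ʔ" + suffix  # verb root in -a plus present tense -a
    glide = "j" if last in J_TRIGGERS else "w"
    return root + glide + suffix


print(attach_suffix("li", "a"))
print(attach_suffix("sa", "a", present_tense=True))
print(attach_suffix("sasa", "a"))
```

Keeping the present-tense case as an explicit flag reflects the fact that the same vowel sequence a + a surfaces with the glottal stop in verbs but with /w/ in the definite noun form.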


4.2.4. Deletion 


In fluent speech the illative suffix /-aŋ/ requires the deletion of a preceding central vowel as
in: *rumba-aŋ > rumbaŋ ‘inside-ILL’ > ‘inside (ill.)’; *tura-aŋ > turaŋ ‘Tura-ILL’ > ‘to
Tura’. The same illative /-aŋ/ requires epenthesis in words ending with a close vowel (cf.
§4.2.3).


4.3. Vowel harmony 


In Koch vowel harmony works both progressively and regressively. The close vowels [i] and [u]
affect the vowel /a/ by changing it into [ə] as in: *tika > tikə ‘water’, *matdʒu > mətdʒu
‘woman’, *haluwa > həluwə ‘farmer’. [a] preserves its quality when the word does not
contain a close vowel. Cf.: dʰaraj ‘sharpen’, peraj ‘buy’. Regressive vowel harmony actively
works in prefixes in causative verbs, adjectives and adverbs. It is triggered by the root vowel
and spreads regressively onto the prefix as in: sa > ga-sa ‘CAUS-eat’ > ‘feed’; dʒu > gu-dʒu
‘CAUS-lie down’ > ‘make smb. lie down’; mosoŋ > tʰo-mosoŋ ‘CAUS-sit’ > ‘make smb. sit’;
kaj > da-kaj ‘CAUS-fall’ > ‘make smth. fall’. In adjectives the unproductive prefix pV- can
assume either the vowel [e] or [i] depending on the vowel of the following syllable as in:
pe-nem ‘good’; pe-nek ‘black’; pi-dan ‘new’; pi-sak ‘red’; pi-buk ‘white’; pi-law ‘long’⁹.
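
The behaviour of the adjectival prefix pV- in the six examples above can be captured by a very small rule. The sketch below is an assumption generalised from those examples only: the prefix is taken to surface as pe- when the first vowel of the root is e and as pi- otherwise. The names `first_vowel` and `adjective_prefix` are hypothetical.

```python
VOWELS = "aeiouəɔ"


def first_vowel(root: str):
    """Return the first vowel of the root, or None if it has none."""
    for ch in root:
        if ch in VOWELS:
            return ch
    return None


def adjective_prefix(root: str) -> str:
    """Attach pV-, choosing e before a root with e and i elsewhere (assumed rule)."""
    prefix = "pe" if first_vowel(root) == "e" else "pi"
    return prefix + "-" + root


for root in ["nem", "nek", "dan", "sak", "buk", "law"]:
    print(adjective_prefix(root))
```

Whether roots with other vowels pattern with pi- is not stated in the text, so the default branch should be read as a placeholder.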

The process of vowel harmony also affects the demonstratives/third person pronouns i
‘this’ (proximal) and u ‘that’ (distal): they lower to e- and o- respectively when combined with
the plural suffix -rəŋ as in erəŋ ‘these/they’ and orəŋ ‘those/they’.

The phenomenon of vowel harmony works similarly in Rabha, except that it is normally 
regressive (Umlaut, according to Joseph 2007: 126-127). A few instances of the opposite
process are referred to as progressive contact assimilation (ibid.: 128). Regressive vowel
harmony (assimilation) is documented for Tiwa (Joseph and Burling 2006: 33-34, 38). The
dialect of Garo spoken in Bangladesh shows cases of progressive vowel assimilation (ibid.: 
37-38). But in Tiwa and Garo vowel harmony is not always consistent. In Boro vowel 
harmony is shown in the adjective prefix gV- where V stands for the vowel whose quality 
depends on the vowel of the following syllable (ibid.: 39). 


9 This adjectival pV- prefix is found in a few Rabha words and has cognates in other Boro-Garo languages: gu-
in Boro, gi-/git-/gip- in Garo and ko- in Tiwa (Joseph and Burling 2006: 34-35; Burling 2009). 

5. Suprasegmental phonology
5.1. Stress 


In Koch the phonetic correlate of stress is a combination of length and loudness. Koch stress
is phonologically predictable: in simple words it invariably falls on the last syllable
as in: bic da ‘water lily’, ra deg ‘king’, Eat ben ‘bulbul’.
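
Because stress in simple words is predictable, it can be assigned by rule. The following minimal sketch assumes the word is supplied already syllabified with dots (as in sak.maj ‘fly’ in §3.3) and marks the final syllable with the IPA stress mark; the function name `mark_stress` is hypothetical, and suffixed or compound words with double stress are outside its scope.

```python
def mark_stress(syllabified: str) -> str:
    """Place the primary stress mark on the last syllable of a simple word."""
    syllables = syllabified.split(".")
    syllables[-1] = "ˈ" + syllables[-1]
    return ".".join(syllables)


print(mark_stress("sak.maj"))
print(mark_stress("na"))
```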

Words with inflectional suffixes carry stress in two places — one on the last syllable of the 
root and the other on the last syllable of the suffix as in: hutf'uy-a -na ‘turtle-DEF-ACC’ > ‘the 
turtle’. 

In compound words (including proper nouns) or those derived from two roots, stress also
falls in two places as in: p^e pora ‘dza ‘Snake King’.

The nature of double stress in the latter two cases needs further investigation as there are 
polysyllabic words with longer roots as well as the roots followed by more than two suffixes. 


5.2. Contrastive pitch 

Koch is one of a few Boro-Garo languages that have abandoned tone. However, pitch
may have become lexicalized in a few instances, e.g. in order to distinguish certain
demonstratives which are otherwise pronounced identically. This phenomenon has developed in
order to indicate intensification in certain contexts of space and time. The contrasting syllable
is pronounced with a high pitch and is normally lengthened. Consider the following:


(1) aŋ waŋ towa.
I there live
‘I live there.’


(2) aŋ uːaŋ towa.
I over there live
‘I live over there.’


(3) üj dinə. 
that day 
‘The day before yesterday.’ 


(4) uj dina. 
that-REM day 
‘On that day (i.e. earlier than the day before yesterday).’ 


6. Sound correspondences between different Koch varieties 


A number of sound changes have occurred across different Koch varieties. It is a matter of 
historical linguistics to investigate the nature of such changes and reconstruct the proto-forms 
of a language. In this paper we shall only consider the main interdialectal phonological 
correspondences without an attempt to reconstruct the proto-Koch forms. 

In the process of vowel change across the Koch varieties there are certain exceptions, e.g.
in tiniŋ ‘today’ (WK) the first vowel is /i/, not the expected /aj/.

Table 5: Changes in vowels


Vowels   | HK¹⁰   | WK     | Meaning
i ↔ aj   | li     | loj    | ‘go’
e ↔ aj   | ame    | amaj   | ‘mother’
u ↔ a    | dukum  | dakam  | ‘head’
u ↔ aw   | anuy   | anaw   | ‘younger sister’
ɔ ↔ aw   | bantok | bantaw | ‘eggplant’


Table 6: Changes in consonants


Consonants | TK HK WK K Meaning 
te tf tiko tfiko ‘water’ 
tot thi tfi ‘blood’ 

k'ek«etf | nak'ar | nakor | natpr ‘ear’ 

d e d3 dimi d3imaj ‘tail’ 
web war bar ‘fire’ 
he- hək ok ‘stomach’ 
k'e h k'opak | hopak *skin? 
k^ h e$ | uk'ajha | uhujto | uguito *(he) is hungry’ 
pod piuj aj ‘come’ 
lon lin nan ‘drink’ 
loner lampar | nampar | rampar ‘wind’ 
men muk nək ‘see’ 
yon p'ujmuy | gajman ‘having come’ 


Table 7: Aspiration and deaspiration


Consonants | K HK Meaning 
LST pan pan ‘tree’ 
ten tolok | t'olok ‘run’ 


" gj g'oj ‘betel nut’ 
EOE bigina | beg^enek | ‘how’ 


As mentioned in §3.2, syllable-initial consonant clusters are not found in
most of the Koch varieties. In fact, there has been an apparent tendency towards cluster
simplification in Koch, as one can see from the following examples:


Table 8: Cluster simplification in Koch


           | K    | HK    | Meaning
kr → kVr   | kraŋ | karaŋ | ‘feather’
           | kren | keren | ‘bone’
pr → pVr   | prat | parat | ‘tear’
           | preŋ | pereŋ | ‘understand’
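
The correspondences in Table 8 suggest that the initial cluster is broken up by a copy of the following vowel. The sketch below states this as a regular-expression rewrite; it is an assumption based on the four pairs in the table (only kr and pr are covered), and the names `CLUSTER` and `simplify_cluster` are hypothetical.

```python
import re

# Word-initial k or p, followed by r and a vowel: kr/pr + V -> kVrV / pVrV.
CLUSTER = re.compile(r"^([kp])r([aeiouəɔ])")


def simplify_cluster(word: str) -> str:
    """Break an initial kr-/pr- cluster with a copy of the following vowel."""
    return CLUSTER.sub(r"\1\2r\2", word)


for kocha_form in ["kraŋ", "kren", "prat"]:
    print(kocha_form, ">", simplify_cluster(kocha_form))
```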


10 The following abbreviations are used for denoting Koch groups: HK = Harigaya Koch, K = Kocha,
TK = Tintekiya Koch, WK = Wanang Koch.

7. Loanword phonology


Koch has long been in close contact with the surrounding Indo-Aryan languages
Bengali, Assamese and Hajong. The result of this contact is evident in both the phonological
and grammatical systems of modern Koch. However, the major impact is seen in the lexicon:
Koch has been drawing heavily on Bengali, Assamese and Hajong for loanwords, with the effect
that a large share of the Koch vocabulary now consists of Indo-Aryan loanwords.

Having entered the Koch lexical system, the loanwords did not remain totally unchanged.
Many of them fell under the influence of the Koch phonological system and sometimes
display the following adaptations.


a) devoicing: final voiced plosives become voiceless as in: bag > bʰak ‘part’, gərib >
gurip ‘poor’, dudʰ > dut ‘milk’;

b) metathesis: bemar > beram ‘ill’, beleg > begal ‘different’;

c) aspiration: here hypercorrection takes place. Koch speakers associate
voiced aspiration with Assamese loanwords and over-apply it to cases where it should
not normally apply as in: dʒan > dʒʰan ‘noun classifier (human)’, nidʒora > dʰora
‘stream’, tʃakor > tʃakʰor ‘servant’;

d) deaspiration: pʰesa > [pesʲa] ‘owl’, madʒʰari > madʒar ‘middle’;

e) post-nasal stopping: dʒamura > E ‘lime’.


The process of cluster simplification mentioned in the previous section also affects some 
loanwords, e.g. the word for ‘prayer’ tends to be pronounced as partʰona rather than the 
typically Indic prartʰna. 
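
The devoicing adaptation in (a) can be illustrated with a short sketch. The function and table names are hypothetical, and the rule is a simplification that assumes only the word-final plosive is affected, with any final aspiration dropped as in dudʰ > dut. Vowel changes such as the one in gərib > gurip are not modelled.

```python
# Voiced plosives and their voiceless counterparts (assumed mapping for the sketch).
DEVOICE = {"b": "p", "d": "t", "g": "k"}
ASPIRATION = "ʰ"

def devoice_final(word: str) -> str:
    """Devoice a word-final voiced plosive, dropping aspiration on it if present."""
    core = word[:-1] if word.endswith(ASPIRATION) else word
    if core and core[-1] in DEVOICE:
        return core[:-1] + DEVOICE[core[-1]]
    return word  # final segment is not a voiced plosive

for loan in ["bag", "gərib", "dudʰ", "nok"]:
    print(loan, "->", devoice_final(loan))
```

A word that already ends in a voiceless segment, such as nok, is returned unchanged.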


8. Conclusion 


The present study aimed to provide a short description of Harigaya Koch phonology. 
Some areas could not be dealt with in depth and therefore require further investigation. These 
include, but are not limited to, the following. 


1. The problem with the glottal stop mentioned in §2.1: what is its proper place in the 
phonological system of Koch? What is its origin and what is the relation with its 
cognates in other BG languages? 

2. The nature of the controversial [o], which is considered an allophone of /ɔ/ in the 
present analysis, but is perceived as a separate sound by some Koch writers. 

3. The nature of the initial /s/ and its comparison with Rabha cognates. 

4. Prosodic features of Koch, especially the nature of stress in polysyllabic words and 
contrastive pitch. 

5. Phonological description of other Koch varieties such as Wanang, Kocha (Koch- 
Rabha), Chapra, Margan and Tintekiya; their full-scale comparison with each other 
(including Harigaya) and with other BG languages. 
Word list 


Parenthesized abbreviations for Koch linguistic varieties: CK — Chapra Koch, K — 
Kocha/Koch-Rabha, MK — Margan Koch, TK - Tintekiya Koch, WK — Wanang Koch; words 
with no abbreviation are from the Harigaya variety. 


Nature bloom parni 
sun rasan water lily b'eda 
sunrise rasan dotni grass sam 
sunset rasan lini/dagni straw hapren 
Quem nangret, naggrek (TK), bamboo wa, ba (K) 
rangret (WK), naret (K) fruit thaj 
M ramp^ut, ramp"uk, mango botfot 
ramp^u (TK) banana likt'aj 
water tikə, t{ika (WK), ti (TK) jackfruit pantfuy 
dew; mist pamti plum sinit 
rain ran lemon libu 
cloud rambu, rambu (K) wheat gom 
lightning din tfilkajni rice (uncooked) | majruy 
thunder din taratni corn mak'oj, maktu 
rainbow ramd"onu vegetables; ; f 
lisam, nisam 
ana lampar, nampar (WK), curry 
rampar (K) potato han-goglok, alu 
storm rag-lampar sweet potato tharmaray 
earth, soil ha eggplant bantok, bantaw (WK) 
stone loyt'aj groundnut badam 
path lam, lamdor chilli j'aluk 
stream j'ora garlic rusun 
sand hatfen, hantfeg (K) clove lon 
fire war, bar (K) tomato belati bantok 
d'una, wartfu (WK), turning creeper | aprad3ita 
smoke wark"^u (TK), burtfuk betel nut gzj, gj (K) 
(K) lime dzambura 
ash t'apal ginger tfinkut 
mud hadel a tuberous plant | hambuk 
duct d'ulo, hagur (K), haput bean harek 
(TK) 
flood d'ol-tiko 
lake doba Fauna 
forest talaj fish na 
electric eel karen-na 
crawfish hen 
Flora prawn natfen 
tree pan, p'an (K) porpoise na-ner 
bark pan hop:a bird, chicken tow 
leaf ləjtfak cock, rooster taw-bajar 
root tfikra, tfatar (WK) hen taw-matd3u 
thorn pantfu chick taw-sasa 
flower par sparrow tfamtfura, tfontfa 
owl p'esa ant 
bulbul k'ojtobleg termite kokon, tuluy 
peacock mojra spider makra 
heron boga leech iluk 
dove g'ugu bug deba 
pigeon kujtur louse k'irik, klirit 
duck hantfi flea iron 
swan radza hantfi worm d3ir 
crow kawra caterpillar enda 
kite tawley centipede hembram 
egg tawti snail lukruy, sukruy 
animal d3onar horn karan 
cow məsu tusk p'utfin 
ox bolot trunk, proboscis | sur 
water buffalo musi, b^us paw tala 
puruņ, purun (WK, K, tail dimi, dziməj (WK) 
poat TK) beak thot 
sheep bhera puruy wing; feather karay 
bighorn sheep ram puruy bone kerey 
pig wak, bak (K) nest b'atfa 
deer makt[5k, maksok 
dog kuj, kaj (K) Person 
wolf ram kuj Kinship terms 
m mejaw, mejor (K), ad3i man morot, marap (K) 
(WK) woman matd3u, tiri (TK) 
tiger masa father awa, baba 
fox piir mother əjə, ame, amaj (WK) 
jackal sijəl parents əjə-awa 
porcupine sedza child sabet, sasa 
rat motf5t older brother dada 
snake dup"ut younger brother | ad325 
python ad3arngar older sister adza 
chameleon mukulandi younger sister anuy, anaw (WK) 
lizard tfelabari son ma?wa sasa 
frog lewak daughter mətdzu sasa 
turtle hutfun, dura nephew baynaj, b'agina 
crocodile timblay uncle mama, kaku 
monkey kawi husband mija 
insect tjon wife mitfik 
fly səkmaj, sopmaj grandfather ətfu 
butterfly tokpak grandmother əbu 
bee ne ancestors abu-at/uray 
mosquito sok boy maPwa 
ant semar girl motd3u 
big black ant gonga maiden mitfala 
black poisonous n bachelor bant'aj 
aben - 
ant daughter-in-law | bəw 
red poisonous sambur son-in-law kilag 
brother-in-law dewra TN mukruy tiko, makar 
: 3 d3abok, numbuk, mu(k)tfi (WK, K) 
sister-in-law . 
nawsali 
the youngest Food 
member of a taemur, turti food sani (dzinis) 
family oil tel 
salt sum 
meat kan 
Body parts fat butum, batam (WK, K, 
body kan CK) 
head dukum, dakam (WK) milk dut, nono (K) 
temple tfip rice beer tfokot 
hair hogra, hawar (WK) bamboo shoot gad3a 
face mahuy boiled rice maj 
cheek tapa fried rice ladu 
mukruy, mukun (CK), stale rice majham 
eye mokon (TK), nukun bread; cake p'ap 
(MK), məkər (WK, K) dry fish na-səw 
ear nakor, natfor (WK), fermented fish na-pisaw 
nak'ar (TK) 
nose nakun Work 
lip hutfin Agriculture 
mouth hutfur paddy field b'a 
tooth pia, ġa (WK) wasteland ha-bak'ra 
gum p'atrin water canal dzan 
chin kakam oung padd rowa 
tongue telaj, talaj (K) 
neck; throat tukur Occupation 
chest hapak, k'apak (CK), farmer həluwə 
buk (TK) servant tlak"or 
stomach, belly ok, hok (TK) house owner 
: g nokbar 
navel nasti wealthy person 
intestine peta Household 
hand, arm tfak village SOY 
palm tfak tala native village sognok 
elbow kirkun camp bada 
finger tfosi house nok, nagaw (K) 
fingernail tfosukuri room kut^uri 
shoulder kar tfal, nur (K), nuk'rujg 
le tfat^un, tfadam (WK, toot (TK) 
back kund3u pole tar 
waist sin string tenol 
buttocks d'er nokot, nəkət (WK), 
skin hopia kopak (TK) gonr nakap (CK) 
mole tinik house pillar k^uto 
body hair mun fireplace plogkar 
heart tfalpak firewood wapan, bapan (K) 
blood t^i, tfi (WK) broom konten 
sweat g'omar 
stick kondam coin sik:ə 

mortar sankam song tfaj 

pestle mantrey sound huray 

pan hata cold tfikni 

winnow dzep, wan back dher 

small bamboo = status, rank gadan 

basket Pun kind, sort g'enek 

earthen pot mutuk place, ground hadam 

fish pot g"ili land, country hason 

fish catching l medicine dzaba 

instrument paia rubbish dzabra 

boat riy light d3asin, d3isin (K) 

hammer hatur dream dgoman 

a large cleaver- dones story kirsa 

like knife language, kərə, karak (CK), karaw 

axe wasi, basi (K) speech (WK), kraw (K) 

hand fan dzepp'al music tamni 

rope kura lie, untruth p'atra 

thread hintin witch dakal 

needle simi coward dipur 

cloth soka, sokok (WK, K) celebration, hifi 

baby’s cloth ` basek party uris 

bamboo mat d'araj place of worship | t'an 

pillow hodam deity waj, baj (K) 

storehouse tfasan 

animal shed k'owar 

well tfuwa Time expressions 

bridge damal time okot 

drum hem moment d'omok 

flower vase pardzum day san, din 

Koch women’s E night p'ar, p'arok 

top wear : monop, manap (CK, 

Koch women's joa idR TK), manaw (K) 

dress dawn, daybreak | p'ar-monop 

jar kumbaj, kambaj (K) noon san mad3i 
evening gosom, gasam (WK, K) 

Other nouns now taj, ela 

thing d3inis G tini, tinin (WK, K), tajni 

name muy (TK) 

part bat Yesterday lini ganek 

side p'ak the day before | di 

middle mad3'ari yesterday eg 

edge kortfa tomorrow p'ujni ganek 

hole, cavity gata week sopta 

footprints tfaman month mas 

shadow sajna year bosor 

stain, spot gap early sakle 

rust maran daily dinni 
still tajbon, tajban (WK), quiet tfitfik 
elaw angry haw 
recent elajni true hatfəj, kantak (K) 
untrue mitfəj 
Adjectives 
old majtfam Physical actions 
new pidan be, become don 
pelem, gelen, penem be, exist ton 
good (WK, K), pemen (CK) ma?ka, era (CK), baba 
hetja, hitía, sartfa (K), EE (TK), toptfa (K) 
bad nati (TK), nakt*a (CK), be alive hen, Ven (CK, TK) 
henda (MK) can man 
wet sumni, samni (CK) be able kap, k'ap (K) 
dry rani be cold tfik 
hot taw, burni (TK) be wet sum, sam (CK) 
cold gisiņ, tfikni, tfokni be dry ran 
(WK), tfikraw (K) be thirsty karan 
sour hini uhuj, ujuj (WK), uk*aj 
right DE be hungry (T. K) guj (WK) U 
left debra be tasty t'aw 
big mata, goda (K) be sweet sum 
small turti, akuj be lost marat 
short k'atak be shy panok 
high pira be full pu 
broad d'apa fear kir 
straight salsala be ripe mun, man (K) 
beautiful pelemsa ; masoy, ambak (K), nu 
clean, clear sap sit (down) (TK) : 
impure suwa lie down; sleep | dzu 
dark andar come p'uj, oj (WK) 
dim ama-sama reach sok 
fair kambok go li, loj (WK) 

: ibuk, boksam, bo?sam, lidgum, lud3um, ladzam 
white m lan (WK, K) walk WK. K) 7 7 
black penek move back wal 
blackish penenek return wal p*uj 
red pisak (WK, K), nal run tolok, tolok (K), dawraj 
poor gurip jump p'alp^raj 
ill beram n uraj, pur (TK), pu (WK, 
different alda, alea, begal y K) 
other adek, dosra dive tilup 
former agani leave, flee d3ar 
fragrant butumni turn g^uraj 
Nis munni, paman/pamun turn, roll petfaj 

(K) dance bosa 
raw pe?t'en come out, dt 
strong datik appear 
far had3an, pidzan (WK, hide lukəj 

K), dzanni (TK) 
remain, stay p'agat think suņsuy (WK, K), susum 
ascend, rise dur (CK), baba 
descend hur listen natem, natam (K) 
fall kəj learn; teach sikaj 
sink dubəj like; need ləgi, niga (K), lam 
look tfoj eat sa 
feel gok, sundat drink lin, lay (WK, CK), nay 
cry hep (K) 
die t^i, tfi (WK, K) swallow molok 
rot pisow bite kak 
meet bay chew tfobaj 
burn, be on fire. | ham, k'am (CK, TK) sting ditfuk 
have a fever kalum haw, law (WK, K), 
boil bet, rot give lahaw (MK), lakha 
be sour hi (TK), ak"o (CK) 
shine tfi, d3in take la, lag 
suit ja?dar get, receive man 
crack, explode | patak put tan 
shake J'akaraj bring naha, laha/laga (K) 
wait d3"iraj, sam lift; carry paj 
swell pok send wasek, beset (K) 
pain gaman, sa catch low, log (CK) 
cough tohot collect, gather sambuk 
bathe (tika) lu buy peraj 
warm oneself dap sell pial 
take oneself wear, put on = 
across ee KR jun 
reduce tfam lay, spread dan 
bow bam rub, smear sik, sit 
fight kalup cook (maj) lum, lam (K) 
do rek, rat (WK), tak (K) serve a meal thk 
make, build t'em, t'ari, t'arəj hug ankolaj 
play gel pull but 
play a musical "" push d'eka haw 
instrument pull out p'2k, pap 
speak bak shoot kaw 
sing (aj) lum close tfup 
laugh mini open golok, melaj, mekaj 
shout ki cover d'okaj 
scream partfek wrap meraj 
stammer, stutter | p’uk tear tfer, d'abat, parat 
call paw, kalan (WK), klay pluck dak 

(K) peel daw 
ask sin tfaj mill, grind dikit 
crow sep winnow tfaw 
SS muk, nok (WK), nuk SOW kaj 

(TK, CK, MK) enter day 
hear na sharpen d'araj 
move loraj touch dogok 

take/bring down | d'or fill dup"uy 

throw bak'aj, bakaj (K) h t^umuk, t'unuk (CK), 

break buj uid tanak (K) 

cut handok lose t'amarat 

pierce suduk dry tharan 

burn (smth.) saw bathe t'uluk 

beat tok thrust gaday 

kill tat make smth. : 

hang alaj, tanaj stand Sard) 

chase, hunt d'awaj make clean gid3in 

drive, chase gadzar take out gədən 

burry bukaj take down gud"ur 

save batfaj lay down gud3u 

want gon 

give up gədza haw Adverbs 

know rin, das man here jere 

search bitfərəj there were 

carry ga here, hither jag 

bear, endure sowaj there, thither wan 

wash gin, gan (K) near kandaje 

wipe musruk, musrut dod piraje, piri (WK), kara 

Sweep nek CK, TK, 

suck bosop, busup, basap (K) below tfahaje, tlarek (WK) 

blow (by ahead akay 

mouth) DS back pasan 

vomit p'at behind d'erpasay 

graze tfaraj tfahejan, tfarek (WK), 

restrain t"amaj cownwares kamo (CK), kama (TK) 

ignite, kindle talan westwards rasan liniwag 

help ged'ey quickly d'am-d'am, taratori 

forse (mon) mandak/ suddenly hotase, hotaten 
wandak/wandat then tane, weren 

make a hole doblor, hot yet taw 

feed gasa so te 

give to drink git'ilin 

make smb. sit t'omoson Postpositions 

wake smb. up gasraj before, in front ake, ago (WK, CK, K) 

make smb. lie T of 

down gudzu inside, under rumbaje 

make smth. fall | dakaj outside bajeran 

boil smth. debet below tJahaje, tJarek (WK), 

fry tek kamo (CK), kama (TK) 

dress, clothe dakan for d3one, talon (WK) 

stick dakap except; for bade 

cross dapat from taki, duro, haton 

crack smth. daptak by, though, with | dije 

give life to, save | dehey while, when bələ 
after pore, patfe that u 
Sis piraje, piri (WK), kara something ataba 
(CK, TK) someone tlayba 
somewhere bejaynba 
Question words sometime biblaba 
who tjan like this inka, eg'enek 
lun ata, ato (WK), utu (K), like that unka, agtenek 
bita (TK) this much esoman, itfuku 
where bere, bejay, bin (CK) that much asoman, utfuku 
when bibla noun classifier l . 
how bed3at, beg'enek, binko (human) ali a s o 


how many, how 


noun classifier 


Ser? besoman, bitfuku (hobian) a- (WK, K), -ta 
what kind begtenek let us; OK de 
why atana, atay(n)a (WK, yes he 
K) hey, hello 2j 
prohibitive 
ta 
Personal pronouns marker 
I an emphatic ios 
you (sg.) nan, gan (K) particle í 
he, it jə, wa; jara, wara 
she id3u, ud3u 
nin (WK, K exclusive), 
we nuy (CK, MK), na?ay 
(WK, K inclusive) 
naro, nonok (K), napru 
you (pl) nuo d 
they Coon, orar, enok (K), 
ono (K), utru (CK) 
Quantifiers 
all bebak, gotal (WK, TK), 
dandak (WK) 
ayasan, melasan, 
much, many E (K) 
a little akujsa, tepeksa 
this much itfuku 
that much utfuku 
almost poraj, praj, ganan (K) 
Conjunctions 
and aro, ara (WK, K) 
but naten (K), kintu 
or (conjunctive) | ba 
or (disjunctive) | na 
therefore od3one 
Demonstratives, miscellaneous 
this i 
3. Poula phonetics and phonology: An initial overview¹ 


Sahiinii Lemaina Veikho 
University of Bern, Switzerland 


Barika Khyriem 
North Eastern Hill University, Shillong 


This paper presents the first structural description of the syllable structure, sound segments and tonal 
patterns of Poula, an Angami-Pochuri language of Manipur and Nagaland. The canonical shape of the Poula 
syllable consists of an obligatory vowel, a tone and an optional onset. The consonant inventory consists 
of 25 phonemes and closely resembles the inventories of other Tibeto-Burman languages of the area, with some 
interesting exceptions. There are six phonemic monophthongs and four phonemic diphthongs. An acoustic 
analysis of Poula monophthongs also shows that the F1 and F2 acoustic space of /u/ has a large standard 
deviation. Poula has three contour tones (High-Falling, Mid-Rising & Falling, and Mid-Falling) and one 
Low-Level tone. 
1. Introduction 


Poula (ISO 639-3) is spoken by the Poumai Naga in the states of Manipur and Nagaland, in the 
northeastern part of India. The term Pou /pou/ refers to the name of the great-great-grandfather 
from whom all Poumais are believed to be descended, and the term Mai means 
‘people’. Thus, Poumai means ‘descendants of Pou’. The Poumai are one of the Naga tribes 
that reside predominantly in Senapati district in Manipur, as well as in a few villages in Phek 
district in Nagaland. During the colonial period, the community was considered to be part of 
the Mao tribe.² The community struggled for tribal recognition for more than fifty years, from 
1950 until 2003, when they were officially recognized as one of the distinct Naga tribes in 
India by the Government of India³ (Barooah 2011). The Poumais are also known by different 
names: Poumei, Pumei, Paumei, Pome and Pomai. However, the term Poumai Naga is 
officially recognized by the Government of India. 

The Poumai Nagas live in over 100 villages in Manipur and Nagaland. According to the 
2011 Indian Census Report, there are 127,381 Poumai Nagas in Manipur. No official census 
data exist for Poula speakers in Phek district in Nagaland, but their number is estimated to be between 
6,000 and 10,000. In Manipur, the villages where the Poumai Nagas dwell are divided into 
three blocks: Paomata, Lepaona and Chilivai under Senapati district (see Figure 1). The 
varieties spoken across these three blocks differ in terms of phonology and lexicon. However, 
even within the blocks there is significant phonological and lexical variation. A few varieties 
like Ngimaila (spoken in Oinam village), Raimaila (spoken in Ngari village), Dumaila 


¹ We would like to acknowledge Dr. Priyankoo Sarmah, Ismael Lieberherr, Amos Teo and the anonymous 
reviewer for their constructive criticisms and suggestions. 

² According to the information provided by the Poumai Naga Literature committee on 3rd Dec. 2013 (personal 
communication). 

³ The Act of Parliament ‘The Schedule Castes and Schedule Tribes Orders Amendment, Act 2002’ received the 
assent of the President on 7 January 2003. 
(spoken in Khundai village) are commonly viewed as separate languages from Poula by the 
people within the communities, since these varieties are not intelligible to other speakers of 
Poula. Some varieties spoken in Paomata block are similar to Mao, another Angami-Pochuri 
language. 

This study examines the variety of Poula spoken in Naamai village (also known 
as Koide village) under Lepaona block, which lies between the Paomata and Chilivai blocks. 
We chose this variety because it appears to be intelligible to most Poula speakers, being 
quite similar to the koine that was used to translate the Bible and church hymns. 


[Map of Senapati district, Manipur, showing Paomata Block, Lepaona Block, Chilivai Block, National Highway 2, Ukhrul district and Imphal] 

Figure 1: Map indicating the three blocks: Paomata, in the north; Lepaona, in the central area; and 
Chilivai, in the south.⁴ 


Poula is not mentioned in most classifications of Sino-Tibetan languages (Benedict 1972, 
Matisoff 1978, Bradley 2002). Recently, Lewis et al. (2013) have classified Poula as a 
member of the Angami-Pochuri group, which they consider to be a sub-branch of Kuki-Chin. 
However, in more conservative classifications (e.g. Burling 2003, van Driem 2011, 2014) 
Angami-Pochuri is placed outside Kuki-Chin, pending further evidence for any higher-level 
classification. 

Little linguistic work has been done on Poula. The materials available in the language are the Bible, 
translated by the Bible Society of India, and the hymn book. Thus, the present paper is a first step 
towards more in-depth research into Poula, beginning with a phonological analysis of the 
language. 

The data for this study were elicited from a male native speaker (26 years of age) and 
later cross-checked with two other native speakers of similar age (one male and one female). All 
the subjects are originally from Koide village. All the data were collected in Shillong, India. 


⁴ The basic structure of the map is taken from Dr. R. B. Thohe Pou’s online map, available at 
http://www.e-pao.net/news section/images/opinion/Map 2009 Thohe.png (accessed on 10th August 2015). 
2. Syllable Structure 
The syllable structure of Poula, unlike that of most other Tibeto-Burman languages, permits neither codas 
nor onset clusters; this is discussed in detail in §6. The canonical shape of the Poula syllable 
consists of an obligatory vowel and a tone, plus optional elements, in the following 
linear structure: σ = (C) V₁ (V₂) T 

The distribution of potential syllabic constituents within a syllable is listed in Table 1. 


Table 1: Phonotactic distribution of syllable constituents 


(C) Vi (V2) T 
pp bttdece”kk g High-falling 
te dz i uə* 
m n 0 Oh eo a* iuo Mid-rising and 
szeh falling 
f 
vlj Mid-falling 
Low 


* indicates the two possible vowels in V₁ for CV₁V₂ words. 
The syllable structure of Poula can also be represented metrically as having the 
hierarchical structure in Figure 2, where σ represents the syllable, T represents tone, C represents a 
consonant and V represents a vowel. 


[Tree diagram: the syllable node σ carries the tone T and branches into Onset and Rhyme; the Onset contains (C), and the Rhyme dominates the Nucleus, which contains V₁ (V₂)] 

Figure 2: Syllable structure of Poula 

The diagram illustrates that a Poula syllable can minimally consist of a monophthong 
vowel nucleus and can maximally consist of a consonantal onset (C) and a diphthong nucleus 
(V₁ and V₂). The possible syllable structures are illustrated in Table 2. 


Table 2: Possible Syllables in Poula 


Word | Tone | Syllable Type | Gloss 
/i/ | 21 | V₁ | ‘I’ 
/ki/ | 21 | CV₁ | ‘house’ 
/kai/ | 51 | CV₁V₂ | ‘knife’ 
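
The template σ = (C) V₁ (V₂) T and the syllable types in Table 2 can be sketched as a small classifier. This is a minimal illustration under stated assumptions: the function name is hypothetical, the onset is passed as a single ready-made consonant symbol and is not checked against the consonant inventory, the vowel set is the six monophthong symbols, and tone is ignored.

```python
# Six monophthong symbols assumed for the sketch.
VOWELS = set("ieaəuo")

def syllable_type(onset: str, nucleus: str) -> str:
    """Label a syllable as V1, CV1, V1V2 or CV1V2 following (C) V1 (V2)."""
    if not 1 <= len(nucleus) <= 2 or any(v not in VOWELS for v in nucleus):
        raise ValueError(f"nucleus must be one or two vowels, got {nucleus!r}")
    label = "C" if onset else ""           # the onset is optional
    label += "V1" if len(nucleus) == 1 else "V1V2"
    return label

for onset, nucleus in [("", "i"), ("k", "i"), ("k", "ai")]:
    print(onset + nucleus, "->", syllable_type(onset, nucleus))
```

The three calls correspond to the words for ‘I’, ‘house’ and ‘knife’ in Table 2.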
Poula has both light and heavy syllables. Based on the data collected and analyzed, light 
syllables are more common in Poula. Heavy syllables in Poula are restricted to a nucleus 
consisting of V₁ and V₂ with C in prevocalic position. As for consonant combinations, 
Poula does not permit consonant clusters. 


3. Consonants 


There are 25 phonemic consonants in Poula. These are listed in Table 3 according to 
place and manner of articulation. 


Table 3: Consonant phonemes 


Bilabial | Labiodental | Alveolar | Palatal Velar Glottal 
Stops 
unaspirated |p b t d c* k e 
aspirated p. t gus ei 
Nasal m n j^ 0 
Tap or Flap f 
Fricative S 6 Z h 
Affricate te dz 
Approx D j“ 
Lateral 
Approximant l 


* indicates rare phonemes 


The following minimal pairs demonstrate the contrast between voiceless unaspirated and 
voiceless aspirated stops occurring word-initially: 


1) /p/ versus /pʰ/ 
/pa?!/ [pa?!] ‘face’    /pʰa?!/ [pʰa?!] ‘guess’ 
/pə?!/ [pə?!] ‘mother’    /pʰə?!/ [pʰə?!] ‘kind of rope to carry things’ 
2) /t/ versus /tʰ/ 
/ta?!/ [ta?!] ‘go’    /tʰa?!/ [tʰa?!] ‘fast’ 
/to?!/ [to?!] ‘food’    /tʰo?!/ [tʰo?!] ‘place to hang’ 
3) /c/ versus /cʰ/ 
/ca/ [ca] ‘talk’    /cʰa/ [cʰa] ‘waist’ 
4) /k/ versus /kʰ/ 
/kə?!/ [kə?!] ‘drink’    /kʰə?!/ [kʰə?!] ‘bathing’ 
/ko?!/ [ko?!] ‘story’    /kʰo?!/ [kʰo?!] ‘best friend’ 


The following minimal pairs demonstrate the contrast between voiceless and voiced stops: 


5) /p/ versus /b/ 
/pai??/ [pai??] ‘cup’    /bai??/ [bai??] ‘chin’ 
/po?!/ [po?!] ‘wrestling’    /bo?!/ [bo?!] ‘branch’ 
6) /t/ versus /d/ 
/ta??/ [ta??] ‘rotten’    /da??/ [da??] ‘stand in queue’ 
/to??/ [to??] ‘congested’    /do??/ [do??] ‘yesterday’ 
7) /k/ versus /g/ 
/ke?!/ [ke?!] ‘muscles’    /ge?!/ [ge?!] ‘question marker’ 


The following minimal pair shows that the palatal stop /c/ and the velar stop /k/ are contrastive. 
/c/ is a rare phoneme and occurs only with the vowel /a/. 


8) /c/ versus /k/ 
/ca?!/ [ca?'] ‘husband’ /ka?/ [ka?!] ‘pillar’ 


The following minimal pairs demonstrate the phonemic voicing contrast between the 
voiced and voiceless alveolo-palatal affricates: 


9) Ite versus EN < n 
ters [tei?'] ‘news’ /dzi?'/ [dzi?!] ‘divide’ 


fteo?!/ [teo?!] ‘cow’ /dzo?!/ [dzo?!] ‘similar’ 


The following minimal pairs demonstrate the phonemic voicing contrast between voiced 
and voiceless alveolo-palatal fricatives: 


10)/e/ versus /z/ 
/ga??/ [ca?ni?] ‘tiring? ^ /za?ni?/ [zaPni?] ‘regularly on and off 


/gou??/ [eou] ‘thick’ —/zou?/ [zæ] ‘position’ 


The following minimal pairs demonstrate the contrast between voiceless alveolar and 
voiceless alveolo-palatal fricatives: 


11)/s/ versus /e/ 
/sa?!/ [saà?] cat fea [ea^] ‘tired’ 


/sou?*/ [sou?] ‘meat’ /eau?3/ [eou] ‘thick’ 


The following minimal pairs demonstrate a phonemic contrast between the nasals: 


12)/m/ versus /n/ 
/mai?/ mari ‘people’ /nai?!/ [nai?!] ‘father’s sister’ 
/mo!!/ [mo!!] ‘no’ /no!!/ [no!!] ‘soft’ 
13) /n/ versus /y/ 
/na?!/ [na?!] ‘things’ /ga?!/ na: ‘hire’ 
/ne?!/ [ne?] ‘you’ /ge?/ [pe] ‘block at the throat’ 
14) /ŋ/ versus /ŋ̊ʰ/ 
/ga?!/ [na*!] ‘hire’ /j^a?!/  [pha?!] ‘buffalo’ 
/gjou?!/ naun ‘five’ /nau?!/ [n^ou?!] ‘snake’ 


The following minimal pairs demonstrate a phonemic contrast between the alveolar tap 
and alveolar lateral: 


15)/r/ versus /l/ 
/ra?/ [raà?] ‘sweet(n.)’ /laà?/ la: ‘language’ 
/ro?!/ [ro?!] ‘basket’ /lo!!/ [lo!!] *bed' 
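
The logic behind these minimal pairs, two words that differ in exactly one segment, can be sketched as follows. This is a simplified illustration: the function names and the tiny lexicon are hypothetical, the segmentations are taken from pairs (5) above, and tone is left out, although a full check would also require the two words to carry the same tone.

```python
from itertools import combinations

def differs_in_one_segment(a, b):
    """True if two equally long segment sequences differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

def minimal_pairs(lexicon):
    """Return gloss pairs whose segment sequences form a minimal pair."""
    return [(g1, g2)
            for (g1, s1), (g2, s2) in combinations(lexicon.items(), 2)
            if differs_in_one_segment(s1, s2)]

# Gloss -> segments; tone omitted in this sketch.
LEXICON = {
    "cup": ("p", "a", "i"),
    "chin": ("b", "a", "i"),
    "wrestling": ("p", "o"),
    "branch": ("b", "o"),
}

print(minimal_pairs(LEXICON))
```

Storing each word as a tuple of segments keeps multi-character symbols such as aspirated stops together as one unit when the comparison is made.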


3.1. Description of Consonant Phonemes 
3.1.1. Stops: 


/p/ is a voiceless unaspirated bilabial stop; it is always realized as [p] (see 16). However, some 
varieties close to Mao villages in Paomata block produce it as [ɸ] due to spirantization. In 
these varieties, it is represented in the practical orthography by the allograph f, e.g. fü [ɸə] 
‘mother’. /pʰ/ is a voiceless aspirated bilabial stop; it is always realized as [pʰ] (see 17). /b/ is 
a voiced bilabial stop; it is always realized as [b] (see 18). 


Word-initial Word-medial 
16) [poi?!] *head' [la?po??] ‘shallow’ 
[pa?*na>!] ‘twins’ [na?pai?!] *younger (female) 
17) [p^a?!] ‘guess’ [ta?p^ao?!] ‘begin to move’ 
[p'au?!] ‘spade’ [ki?p^ai??zu?!] *house floor sweeping" 
18) [ba?!tou?!] ‘finger’ [tco??ba?!] ‘cow dung’ 
[bai?!] ‘face’ [i?bai?!] ‘it’s me’ 


Figure 3: Spectrograms and acoustic waveforms of /p^a/ ‘guess’ and /ba/ ‘hand’, recorded in isolation, 
showing the positive and negative voice onset time (VOT) for [p^] and [b] respectively. 




/t/ is a voiceless unaspirated alveolar stop; it is always realized as [t] (see 19). /tʰ/ is a 
voiceless aspirated alveolar stop; it is always realized as [tʰ] (see 20). /d/ is a voiced alveolar 
stop; it is always realized as [d] (see 21). 




Word-initial Word-medial 
19) [toi??19?!] ‘time’ [ta??ta?! zu?!] ‘farming’ 
[tao??zu?!] ‘fat’ [na?!to?!] ‘baby food’ 
20)[ta7koi?zu?'] “light (weight)’ [ta??t^a?!] ‘going fast’ 
[tozu] ‘itchy’ [mai? the? zy?!] *kicking someone" 
21) [da?!] ‘beat’ [na??da?!] ‘lefty’ 
[de?ta??zu?!] ‘falling’ [ne?do?!] *your wish* 


/k/ is a voiceless unaspirated velar stop; it is always realized as [k] (see 22). /kʰ/ is a voiceless 
aspirated velar stop; it is always realized as [kʰ] (see 23). /g/ is a voiced velar stop; it is a rare 
phoneme in Poula, occurring only in the question marker, before the vowel /e/ (see 24). 


Word-initial Word-medial 
22) [ka??1o?!] ‘thank you’ [mai?ko?!] ‘story’ 

[kai??mai?!] ‘clever person’ [zaike*!] ‘coagulated blood’ 
23) [kte?!] ‘let us go’ [lo kha?!] ‘bag’ 

[k®a??mai?!] ‘friend’ [naZ?k^a?!] ‘junior’ 
24) [ge?!] *question marker" 


Figure 4: Spectrograms and acoustic waveforms of /ke/ ‘muscle’ and /ge/ ‘question marker’, recorded in 
isolation, showing near-coincident VOT for [k] and negative VOT for [g]. 
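
The three voice onset time patterns mentioned in the captions of Figures 3 and 4 (negative, near-coincident and positive VOT) can be sketched as a small calculation. Everything specific here is an assumption for illustration: the function names, the 30 ms boundary between short and long lag, and the example timestamps are hypothetical and are not measurements from this study.

```python
def vot_ms(burst_s: float, voicing_onset_s: float) -> float:
    """VOT in milliseconds: voicing onset time minus stop release (burst) time."""
    return (voicing_onset_s - burst_s) * 1000.0

def classify_vot(vot: float, short_lag_max_ms: float = 30.0) -> str:
    """Classify a VOT value; the 30 ms boundary is an illustrative assumption."""
    if vot < 0:
        return "prevoiced"   # voicing starts before the release (negative VOT)
    if vot <= short_lag_max_ms:
        return "short lag"   # voicing starts at or just after the release
    return "long lag"        # clearly delayed voicing (positive VOT), as in aspirated stops

# Hypothetical (burst, voicing onset) times in seconds.
for burst, onset in [(0.100, 0.020), (0.100, 0.110), (0.100, 0.170)]:
    value = vot_ms(burst, onset)
    print(round(value, 1), classify_vot(value))
```

In practice the burst and voicing onset times would be read off the waveform and spectrogram for each token.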




/c/ is a voiceless palatal stop; it is a marginal phoneme in Poula and occurs only 
before /a/ in syllable-initial position (see 25). /cʰ/ is a voiceless aspirated palatal stop; it is likewise 
rare and is found only before the vowel /a/ in syllable-initial position 
(see 26). 


Word-initial Word-medial 
25) [ca?! ] ‘a tool for weaving’ [a?ca?!] ‘my tool (for weaving)’ 
26) [c^a?!] ‘waist’ [a?2cha?!] ‘my waist’ 
3.1.2. Nasals 


/m/ is a voiced bilabial nasal; it is always realized as [m] (see 27). /n/ is a voiced alveolar 
nasal; it is always realized as [n] (see 28). /ŋ/ is a voiced velar nasal; it is always realized as 
[ŋ] (see 29). /ŋ̊ʰ/ is a voiceless aspirated velar nasal; it is always realized as [ŋ̊ʰ] (see 30). 


Word-initial    Word-medial 
27) [ma*3mai?"] ‘sinners’    [məi?mə?!] ‘get together’ 
[mao!!zu?!] ‘drown’    [soi??mi?!] ‘dog tail’ 
28) [na?pu?!] ‘younger’    [lo? na? zu?!] ‘patient’ 
[na?!] ‘child’    [co?na?!] ‘morning’ 
29) [ŋai?zu?!] ‘regret’    [pao?ŋao?zu?!] ‘explain’ 
[oK a: ‘big frog’    [mai?ŋao??zu?!] ‘showing others’ 
30) [9 ug ‘buffalo’    [poi? ^oi?!] ‘head louse’ 
[ŋ̊ʰou?!] ‘snake’    kréi ‘to me’ 


Figure 5: Spectrograms and acoustic waveforms of /ŋa/ ‘hire’ and /ŋ̊ʰa/ ‘louse’, which illustrate a fully 
voiced [ŋ] and a devoiced [ŋ̊]. 
3.1.3. Fricatives 


/s/ is a voiceless alveolar fricative; it is always realized as [s] (see 31). /ɕ/ is a voiceless 
alveolo-palatal fricative; it is always realized as [ɕ] (see 32). /z/ is a voiced alveolo-palatal 
fricative; it is always realized as [z] (see 33). /h/ is a voiceless glottal fricative; it is always 
realized as [h] (see 34). 


Word-initial    Word-medial 
31) [sa?zu?!] ‘happy’    [va??sou?!] ‘skin’ 
[sozu] ‘knowing’    [sou**lu”’so**zu!] ‘ability to do’ 
32) [ɕa zu] ‘tired’    [loe E) ‘love’ 
[ɕi?zu?!] ‘bad’    [ma??ɕa?!] ‘himself’ 
33) [za?!] ‘portion’    [sou?za?!] ‘payment for working’ 
[zi?'zu?!] ‘distribute’    [a?za?!] ‘my share’ 
34) [ha!!] ‘blessing’    [lou?hi?!] ‘folksong’ 
[hozu?!] ‘puff’    [mai?hai!!] ‘dirt on body’ 


3.1.4. Affricates 


/tɕ/ is a voiceless alveolo-palatal affricate; it is realized as [ts] before [ə] (see 35). /dz/ is a 
voiced alveolo-palatal affricate; it is realized as [dz] before the vowel [o] (see 36). 


Word-initial 
35) [teo?!] ‘cow’ 


[tei?!] ‘less’ 
[tso?! ] ‘mind’ 
36)[dzou?] ‘today’ 


[dzo?^zu?!]*get together’ 
[dzo?] ‘water’ 


Word-medial 


(aiao *my cow? 
[pu?tei?!zu?!] *headstrong? 
[a dzao?!] ‘my shirt’ 
[5^a?dzo?zv?!] ‘not sufficient? 


Figure 6: Spectrograms and acoustic waveforms of /dza/ ‘share’ and /tei/ ‘less’ to illustrate the difference 
in voicing between the alveolo-palatal affricates. 




3.1.5. Tap and Lateral 


/r/ is a voiced alveolar tap; it is always realized as [r] (see 37). /l/ is a voiced alveolar lateral 
approximant; it is always realized as [l] (see 38). /v/ is a voiced labio-dental approximant; it is 
always realized as [v] (see 39). 


Word-initial    Word-medial 
37) [re?zu?!] ‘writing’    [ta?2re?!] ‘gone’ 
[ro?!] ‘a person's will’    [sa?! rai?!] ‘cloth thread’ 
38) [la?zu?'] ‘standing’    [lo??lou?!] ‘slow move’ 
[lo?!] ‘bed’    [ya?lou?] ‘a type of song’ 
39) [val] ‘looking after’    [ne**voi"!] ‘yours’ 
[vao*!] ‘frog’    [vao?zu?!] ‘stealing’ 
4. Vowels 
The six phonemic monophthongs found in Poula are presented in Table 4. 


Table 4: Vowel phoneme chart 


      | Front | Central | Back 
Close | i     |         | u 
Mid   | e     | ə       | o 
Open  |       | a       | 


4.1. Monophthong minimal pairs 


The following minimal pairs demonstrate the phonemic contrasts between the monophthongs. 


40) /i/ versus /9/ 

/ki?!/ [ki?] ` "house? /kə?!/ [ko?!] ‘drink’ 

/mi?!/ [mi?!] ‘tail’ /mo?!/ ma: ‘mouth’ 
41) /a/ versus /o/ 

/na?!zu?'/ [na?!zu?!] ‘being late’ /no?!zu?'/ [no*!zu?!] ‘pampering’ 

/pa?!/ [pa?!] ‘face’ /por!/ [po?!] *mother 
42) A/ versus /e/ 

/ki?!/ [ki] ‘house’ /ke?/ [ke?!] ‘muscle’ 

/mi?!/ [mi?!] ‘tail’ /me?!/ [me?!] ‘dream’ 
43) /u/ versus /o/ 

/ou?!/ [vu?!] ‘belly’ /oo!!/ [vo!!] ‘pig’ 

/mu!!/ [mu!!] ‘close’ /mo?!/ [mo?!] *no' 
44) /a/ versus /e/ 

/ra?!/ [ra?] ‘sweet’ deelt "Tee? *jungle' 

/ma?!/ [ma*!] ‘pumpkin’ /me?/ [me?!] *dream' 
45) /o/ versus /o/ 

/lə?!/ [lə] ‘boil’ /lo!!/ Doll ‘bed’ 

/kə?!/ [kə] ‘drink’ /ko?!/ [ko?!] ‘story’ 
46) /u/ versus /ə/

/nu?!/ [nià? ] ‘wear’ /nə!!/ [no!!] ‘laugh’ 

/mu!!/ [mu!!] ‘close’ /me?/ [mo*!] ‘mouth’ 
3. Poula phonetics and phonology


4.1.1. Description of monophthongs 


/i/ is a high front unrounded vowel. It is always realized as [i] and can occur in word-initial,
word-medial and word-final position (see 47).


Initial Medial Final 
47) [??!] T [si zu?!] ‘bad’ [mai?li!!] ‘one person’ 


/e/ is a high-mid front unrounded vowel. It is always realized as [e] and can occur in 
word-medial and word-final position (see 48). 


Medial Final 
48) [ne?!ni?] ‘by you’ [pu?3e?!] ‘to him’ 


/a/ is a low central unrounded vowel. It is always realized as [a] and can occur in word- 
initial, word-medial and word-final position (see 49). 


Initial Medial Final 
49)[a!'tso*!] ‘my mind’ [ta?ta?!] ‘farming’ [pou?! pa?!] ‘flower’ 


/ə/ is a mid central vowel. It is always realized as [ə] and occurs in word-medial and word-
final position (see 50).


Medial Final 
50) [so?zu?!] ‘to know’ [a? no] *my pants' 


/u/ in Poula seems to alternate in production between a high back [u] and a high central [ʉ],
but is always produced with rounded lips. For more details, see §4.1.2 below. /u/ occurs in
word-medial and word-final position (see 51).


Medial Final 
51) [pu $! zu?!] ‘to borrow’ [ma?zu?] ‘nightmare’ 


/o/ is a mid back rounded vowel. It can be realized as either [o] or [ɔ], which are in free
variation. It occurs in word-medial and word-final position (see 52).


Medial Final 
52) [po?! zu?!] ‘to carry’ [a2 mo?! ] ‘my brother-in-law’ 


4.1.2. Acoustic analysis of monophthongs 


For both the vowel and tone acoustic analyses, one male native speaker was recorded. The
data were recorded with a Samson 01 USB unidirectional microphone using the
software programme Praat 5.3 (Boersma & Weenink 2013). The recordings were made in a quiet
room at a sampling frequency of 44100 Hz. For the vowel
analysis, the words in (40) to (52), both monosyllabic and disyllabic, were each
recorded three times in isolation. The first two formants (F1 and F2) were measured at the
vowel midpoint using the Burg algorithm. Since the formant values were in Hertz, they
were then converted to Mel values with Praat’s built-in function, which uses the formula in (a).
The Mel values were then used to plot the vowel space with the
NORM suite that is available online (Thomas & Kendall 2007). The ellipses inside the plot
indicate the standard deviation.


a) $\mathrm{Mel}(x) = 550 \ln(1 + x/550)$, where $x$ is the frequency in Hz
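
The Hertz-to-Mel conversion in (a) can be sketched in a few lines of Python. This is only an illustration, assuming the formula reads 550 ln(1 + x/550); the function name `hertz_to_mel` and the formant values are hypothetical and are not measurements from this study.

```python
import math


def hertz_to_mel(hz: float) -> float:
    """Convert a frequency in Hz to Mel, assuming the formula in (a):
    550 * ln(1 + hz / 550)."""
    return 550.0 * math.log(1.0 + hz / 550.0)


# Hypothetical midpoint formant values in Hz (illustrative only).
example_formants_hz = {"F1": 300.0, "F2": 2200.0}

# Apply the same conversion to every formant value.
example_formants_mel = {
    name: hertz_to_mel(value) for name, value in example_formants_hz.items()
}
```

The conversion compresses higher frequencies more strongly than lower ones, which is why it is commonly applied before plotting a vowel space.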




Figure 7: Poula acoustic vowel space plotted in the F1 and F2 dimensions

The results of the acoustic analysis show that the F2 values of the vowel /u/ have a large
standard deviation. For this Poula speaker, /u/ varies between central [ʉ] and back [u]. Future
studies of /u/ will look at more speakers to determine the extent of variation for this vowel.
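
As a rough sketch of how the spread of a formant can be compared across vowels, the following Python snippet groups tokens by vowel and reports the mean and sample standard deviation. The function name `summarise_formant` and all numeric values are hypothetical; they are not data from the recordings described here.

```python
from statistics import mean, stdev


def summarise_formant(tokens):
    """Group (vowel, value) pairs and return {vowel: (mean, standard deviation)}."""
    by_vowel = {}
    for vowel, value in tokens:
        by_vowel.setdefault(vowel, []).append(value)
    return {v: (mean(vals), stdev(vals)) for v, vals in by_vowel.items()}


# Hypothetical F2 values in Mel, three repetitions per vowel (illustrative only).
f2_tokens = [
    ("i", 1500.0), ("i", 1510.0), ("i", 1490.0),
    ("u", 800.0), ("u", 1100.0), ("u", 950.0),
]
summary = summarise_formant(f2_tokens)
```

A vowel whose tokens alternate between a central and a back articulation would show a noticeably larger F2 standard deviation than a stable vowel, as in the invented /u/ values above.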


4.2. Diphthongs 


Poula has four rising phonemic diphthongs. For all diphthongs, the onglide begins from two 
initial targets: [o] and [a], and moves towards two terminal targets [u] and [o] (see Figure 8). 




Figure 8: Poula diphthong chart 


4.2.1. Diphthong minimal pairs 
The following minimal pairs demonstrate the phonemic status of the diphthongs. 


53) /əi/ versus /ou/


/pai?!/ [poi?!] *head' /pou?!/ [pou?!] ‘father’ 

/doi?!/ [doi?!] ‘land’ /dou!!/ [dou!!] ‘learnt’ 
54) /əi/ versus /ai/

/mei?/ [moi?!] ‘fire’ /mai?!/ [mai?!] ‘people’ 

/koi?'/ [koi?!] ‘hear’ /kai?!/ [kai?!] ‘horn’ 
55) /əi/ versus /ao/

/soi?!/ [soi?!] ‘dog’ /sao?!/ [sao?!] ‘punch’ 

/hoi?!/ [hoi?!] ‘iron’ /hao?zu?/ . [hao**zu?!] ‘hanging’ 


56) /ao/ versus /ou/ 
/tao?!/ [tao?!] ‘fat’ /tou?!/ [tou?!] ‘eat’ 
/rao?!/ [rao?!] *bone' /cou!!/ [rou!!] ‘head gear’ 


57) /ao/ versus /ai/ 
/mao!!/ mao *drown' /mai?!/ [mai?!] ‘people’ 
/lao!'/ [lao!!] ‘stream’ /ai?*zu?!/ [lai??zu?!] ‘economic’ 


From the above analysis, Poula has 10 vowels: six phonemic monophthongs /i u e ə o a/
and four phonemic diphthongs /ai ao əi ou/.


5. Tones in Poula 


Like many other Tibeto-Burman languages of the region, Poula is a tonal language. An 
acoustic phonetic study presented in this section demonstrates how tones in Poula are realized 
phonetically by pitch, measured as fundamental frequency (F0). Using the same procedure as for
the acoustic analysis of monophthongs, we recorded the minimal pairs given in (58) to (61) 
from the same subject. Three repetitions of these words were recorded in isolation. From the 
data collected, four lexical tones in Poula have been observed. Using the Chao tone number 
system (Chao 1930), the four tones are represented as Low (11), Mid-Falling (21), Mid- 
Rising and Falling (231), and High-Falling (51). The presence of these tones is illustrated by 
the minimal sets given below: 


Sonorant-initials

58) /na¹¹/ [na¹¹] ‘paint’
    /na²¹/ [na²¹] ‘things’
    /na²³¹/ [na²³¹] ‘later’
    /na⁵¹/ [na⁵¹] ‘baby’

59) /la¹¹/ [la¹¹] ‘easy’
    /la²¹/ [la²¹] ‘navel’
    /la²³¹/ [la²³¹] ‘overflow’
    /la⁵¹/ [la⁵¹] ‘bloom’

Obstruent-initials

60) /pa¹¹/ [pa¹¹] ‘regret’
    /pa²¹/ [pa²¹] ‘torn’
    /pa²³¹/ [pa²³¹] ‘teasing’
    /pa⁵¹/ [pa⁵¹] ‘yellow’

61) /da¹¹/ [da¹¹] ‘heat’
    /da²¹/ [da²¹] ‘cheat’
    /da²³¹/ [da²³¹] ‘beside’
    /da⁵¹/ [da⁵¹] ‘in line’

This study is limited to the monosyllabic words given in (58) to (61). In order to illustrate the
contours of Poula tones, F0 is calculated at 10% intervals across a tone-bearing unit.
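
Sampling a pitch track at 10% intervals can be sketched as follows. This is a minimal illustration, not the procedure actually used: it assumes the F0 track is available as parallel lists of strictly increasing time stamps and F0 values, and it uses plain linear interpolation. The function name `sample_contour` and the example values are hypothetical.

```python
def sample_contour(times, f0_values, n_steps=10):
    """Linearly interpolate an F0 track at n_steps + 1 equally spaced points
    (0%, 10%, ..., 100% for n_steps=10) between the first and last time stamp.
    Assumes `times` is strictly increasing."""
    if len(times) != len(f0_values) or len(times) < 2:
        raise ValueError("need at least two (time, F0) samples of equal length")
    start, end = times[0], times[-1]
    samples = []
    j = 0
    for k in range(n_steps + 1):
        t = start + (end - start) * k / n_steps
        # Advance to the segment [times[j], times[j + 1]] that contains t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        f0_0, f0_1 = f0_values[j], f0_values[j + 1]
        frac = (t - t0) / (t1 - t0)
        samples.append(f0_0 + frac * (f0_1 - f0_0))
    return samples


# Hypothetical falling contour over a 200 ms vowel (illustrative only).
contour = sample_contour([0.0, 0.1, 0.2], [200.0, 150.0, 100.0])
```

Because every token is reduced to the same number of points, contours from vowels of different durations can be averaged and plotted on a common time-normalized axis.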
Figure 9: Acoustic realization of Poula tonemes across a time normalized vowel.


Looking at the acoustic realization of these four tones, we can see that all four tones fall 
towards the end. Although this could be due to declination or intonation, one other possible 
reason for this is that Poula syllables appear to have historically lost their codas. The fall in 
pitch at the end of the syllable may reflect traces of these lost codas — post-vocalic consonants 
have been shown to contribute to tonal development in languages like Vietnamese (e.g. 
Haudricourt 1954). The results show that Poula has three contour tones (HF, MRF and MF)
and one almost level tone (L).


6. Conclusion 


This study shows that Poula has 25 consonant phonemes and 10 vowels, of which six are
monophthongs and four are diphthongs. Syllables in Poula have no codas, and there are
restrictions on consonant clusters in syllable onset position. The language also has four
contrastive tones.

To conclude, we make some cross-linguistic comparisons with other related languages to 
give an idea of the similarities and differences between the phonology of Poula and that of 
other languages of Nagaland and Manipur. 

Firstly, Poula syllables lack codas and also disallow consonant clusters in onset position. 
Many Tibeto-Burman languages of the region generally allow consonants in coda position, 
although these are often restricted to a few plosive and nasal consonants. However, we rarely 
find consonantal codas in the languages of the Angami-Pochuri group (Teo 2014). Most
Angami-Pochuri languages also allow consonant clusters in onset, typically consisting of a 
stop followed by a liquid (Teo 2014: 113). With this in mind, we may find consonant clusters 
in varieties of Poula spoken in the Paomata block, which borders areas populated by speakers 
of Mao. Mao is an Angami-Pochuri language that allows clusters like /kr/ (Giridhar 1994). 

Looking at the consonant and vowel inventories, we note that the vowel system of Poula 
has six monophthongs, which is quite typical of the Tibeto-Burman languages of the area 
(Burling 2013). The consonant inventory of Poula also tends to be fairly typical, but we note
the presence of alveolo-palatal sibilants that are not reported in other Tibeto-Burman 
languages of the area — however, these other languages are often reported to have post- 
alveolar sibilant phonemes. More interestingly, Poula has an aspirated voiceless velar nasal, 
which is a fairly common phoneme in Poula, but no aspirated voiceless nasal at any other 
place of articulation. A voiceless velar nasal is not reported in any other Angami-Pochuri 
language, although it is common to find voiceless / breathy bilabial and alveolar nasals in 
these languages (see Blankenship et al. 1993 for Khonoma Angami, Harris 2009 for Sumi). 

With regard to tone, Poula has four tones, similar to what has been reported for Khonoma
Angami (Blankenship et al. 1993), Mao (Giridhar 1994) and Chokri (Bielenberg & Nienu 
2001). However, other Angami-Pochuri languages can also have three tones: e.g. Sumi (Teo 
2014) and Khezha (Kapfo 2005); or even five tones in Tenyidie / Kohima Angami (Giridhar 
1980, Kuolie 2006). 

Finally, since this is the first description of the phonology of Poula, it is hoped that this 
work will form the basis for further research on this language, and that it will contribute to our 
understanding of the Tibeto-Burman languages of Nagaland and Manipur and their position 
within the larger Trans-Himalayan family. 
Abstract

Tenyidie, also known as Angami, is a Tibeto-Burman language spoken in the north-eastern Indian state of
Nagaland, which borders Myanmar (Burma). Unlike other Tibeto-Burman languages in this area that
have two or at the most three contrastive tones, it has four level tones, as exemplified below.


Extra High: dá ‘to chop’ 
High: da ‘to pack’ 
Mid: da ‘to blame’ 
Low: da ‘to paste’ 


This study reviews the reports of a fifth tone by various researchers and discusses the nature of the tone and the
circumstances in which it appears. These reports conflict with studies that find only four tones in
the language. Although recent studies point to the language having only four tones,
grammarians of Tenyidie claim that there are five, capturing native speakers’ intuition. We present a
phonological analysis of the fifth tone, arguing for a bi-tonal representation on the basis of the set
of morphophonemic alternations seen with the four tones. The fifth tone is analysed as
an overt High tone with a floating Mid tone.
1. Introduction


Tenyidie is a Tibeto-Burman language spoken by the Angamis, a tribe that resides in the 
district of Kohima on the southern border of the state of Nagaland in India. Previous work on 
Tenyidie generally refers to it as Angami. Although the terms Tenyidie and Angami are used 
synonymously when it concerns the language, officially, Tenyidie is the name of the language 
and Angami is reserved for the name of the tribe and the people. Standard Tenyidie is a koine 
that is based mainly on a number of varieties spoken close to Kohima and in surrounding 
villages, including Khonoma village. 

In this paper, we address the question of whether Tenyidie has five tones, as described in 
previous work (e.g. Burling 1960, Kuolie 2006) or four tones as described in an acoustic 
analysis by Dutta et al. (2012). In this study, we find evidence for five contrastive tones;
however, we note that in the absence of any affixes, two of the tones are phonetically
indistinguishable in terms of pitch. Rather, what distinguishes the two tones are consistently 
different patterns of morphophonemic alternations. 


2. Tenyidie word structure 
In general, the structure of all syllables in Tenyidie is CV. Exceptions to this are syllables
composed of just V, as in /à/ ‘I’, and CVN in the first syllable of /kendz3/ ‘cannot’. There are
only six vowels in the inventory /a, e, i, o, u, ə/. Kuolie (2006) lists forty-one consonants in
his study, as seen in Table 1.
Table 1 — Consonants of Tenyidie, according to Kuolie (2006)


Articulation 
Bilabial | Labiodental | Dental/alveolar | Alveolar | Palatal Velar Glottal 
Place — 
|Manner Vl Vd! VI Vd VI Vd Vl Vd VI Vd VI Vd VI Vd 
p b t d k g 
Plosives 
ph th kh 
m n) n ü 1) 
Nasals 
mh nh fih 
Fricatives f V S zd. 2 h 
pf bv ts dz c j 
Affricates 
pfh tsh ch 
1 
Lateral 
Ih 
r 
Trill 
rh 
W y 
Approximants 
wh yh 


Non-derived Tenyidie words are mostly monosyllabic or disyllabic, basically CV or 
CVCV. There are a few non-derived trisyllabic words as well, for example, /kemend/, 
meaning ‘flirtatious’, and /ket"eg"6/, meaning ‘satisfying’. In a non-derived polysyllabic 
word, non-final syllables are always one of the six syllables: /ke, te, me, pe, rə, tʰe/. It may be
noted that these six initial syllables all have Mid tone, while the main lexical tonal contrast is
found on the last CV syllable. Table 2 shows some examples of disyllabic and trisyllabic
words in the language. The nomenclature and description of the tones are given in the
following section.


Table 2 — Examples of non-derived disyllabic and trisyllabic words with tones specified 


Word | Tone on final syllable | Gloss
mené | Low | ‘soft’
tekhü | Extra High | ‘tiger’
rodi | Mid | ‘to change’
kesa | High | ‘new’
kemena | Mid | ‘flirtatious’
ket"eg"ó | Extra High | ‘satisfactory’
3. Previous studies of tone in Tenyidie


A number of previous descriptive studies analyse Tenyidie as having five tones. Burling 
(1960) reported five tones in Tenyidie, as given in Table 3. 


Table 3 — Tones as reported by Burling (1960) 


Diacritic (on /o/) | Tone 

/6/ High 

/6/ Mid normal 
et Mid resonant 
/o/ Low even 
/6/ Low falling 


There has also been mention of the language having five tones by N. Ravindran (1974) 
and Giridhar (1980). Kuolie (2006) also describes five tones in the language, and gives 
examples showing a five-way contrast, as seen in Table 4. 


Table 4 — Tones as reported by Kuolie (2006) 


Word Gloss Tone 

pé ‘to incline’ High tone 

pé ‘fatty’ High-Low tone 
pe ‘bridge’ Mid tone 

pê ‘to tremble’ Low-High tone 
pè ‘to hit/shoot’ Low tone 


Although the names and phonetic descriptions of the tones differ from study to study,
most researchers of Tenyidie agree that there are five contrastive tones. However, a recent
study by Dutta et al. (2012) carried out a more thorough acoustic analysis of tones in Tenyidie to
determine whether all five tones were phonetically different in terms of pitch. Words with the
five previously reported tones were recorded and the F0 of each was analysed. The study
concluded that only four tones were phonetically distinct. In that study, the
two tones following the tone with the highest pitch (the ‘High-Low’ and ‘Mid’ tones in
Table 4) were merged into a single tone. The tones were renamed Extra High,
High, Mid and Low, from the highest pitch to the lowest. Table 5 compares the tones as reported
by Dutta et al. (2012) with the previous five-tone analysis by Kuolie (2006).
All four of these tones appear to be level.


Table 5 — Comparison of five-tone and four-tone analyses 


Five-tone analysis | Four-tone analysis (Dutta et al. 2012)
Word | Gloss | Word | Gloss | Tone
pé | ‘to incline’ | pé | ‘to incline’ | Extra High
pé | ‘fatty’ | pé | both ‘fatty’ and ‘bridge’ | High
pe | ‘bridge’ | | |
pé | ‘to tremble’ | pe | ‘to tremble’ | Mid
pé | ‘to hit/shoot’ | pé | ‘to hit/shoot’ | Low
Note that in a study of a closely related variety of Tenyidie, spoken in Khonoma village,
Blankenship et al. (1993) also found phonetic evidence for only four tones, which they 
represented with the numbers from 1 to 4 from highest to lowest in pitch, as seen in Table 6. 
However, it is unclear if we are dealing with a different tone system from that of the 
standardized variety. 


Table 6 — Tones as reported by Blankenship et al. (1993) 


Term | Meaning
su¹ | ‘to wash face’
su² | ‘in place of’
su³ | ‘to block (as of view)’
su⁴ | ‘deep’


4. The fifth tone 


From the previous sections, it is seen that although there are various reports of
five tones in the language, there have at the same time been studies which report the number
to be just four. Figure 1 shows the pitch traces of the pe minimal set. The pitch traces were
generated using the software Praat (Boersma and Weenink 2013). Observe that the tones
‘High 1’ and ‘High 2’ in the figure fall within the same pitch range, distinct from the other
three tones.


[Figure 1 plot: pitch traces, Pitch (Hz) against Time (s), for /pe/ ‘to incline’ (Extra High), ‘fatty’ (High 1), ‘bridge’ (High 2), ‘to tremble’ (Mid) and ‘to hit’ (Low)]


Figure 1 — Pitch traces demonstrating the pe minimal set 


While there might be only four distinct phonetic tones, there is evidence that points
towards a fifth tone. This tone is not phonetically distinct from the High tone, but behaves
differently in morphophonemic alternations.

Recall that Dutta et al. (2012) only posit one ‘High’ tone in their four-tone model.
However, if we accept this analysis, we find that ‘High’ tones display two different patterns
of morphophonemic alternations. Observe the data sets (1) and (2), using the diacritics of the
four-tone system, where tonal changes occur on the negative imperative suffix -hie and the
imperative suffix -lie, depending on the final tone of the stem.


(1) (i) //dà + hie// = /da-hié/ ‘to chop’ + negative imperative
    (ii) //dà + hie// = /da-hié/ ‘to pack’ + negative imperative
    (iii) //là + hie// = /la-hié/ ‘to do again’ + negative imperative
    (iv) //pedā + hie// = /peda-hie/ ‘to blame’ + negative imperative
    (v) //dà + hie// = /dà-hie/ ‘to paste’ + negative imperative




4. The fifth tone of Tenyidie 


(2) (i) //peta + lie// = /petä-liè/ ‘to drive’ + imperative
    (ii) //rali + lie// = /rali-lié/ ‘to rest’ + imperative
    (iii) //rali + lie// = /rali-lié/ ‘to slow down’ + imperative
    (iv) //radi + lie// = /radi-lié/ ‘to change’ + imperative
    (v) //pelé + lie// = /pelé-lié/ ‘to believe’ + imperative


We observe that in both (1) and (2), the stems in (ii) and (iii) have the same ‘High’ tone,
but they trigger two different tones on the same underlying suffix. We see that the Extra High
and one of the ‘High’ tones both trigger a Low tone on -hie and -lie, while the Mid and Low
tones, along with the other ‘High’ tone, trigger a Mid tone on these same suffixes. The ‘High’
tone in (iii) is therefore seen to be different from the ‘High’ tone in (ii). We therefore propose
that the High tone in (iii), ‘the fifth tone’, is a binary combination of the High tone, H, and
either the Mid tone, M, or the Low tone, L, where only H is overtly realised and the second
tone participates as a floating tone in morphophonemic alternations. The fifth tone is therefore
either H(M) or H(L).

Like the suffixes -lie and -hie, the present continuous suffix -ba is also underspecified for
tone, i.e. it is underlyingly ‘toneless’ and its tone is triggered by the verb it attaches
to. However, -ba acquires either an Extra High tone or a High tone, as seen in (3) and (4).


(3) (i) //peta + ba// = /peta-ba/ ‘to drive’ + present cont.
    (ii) //rali + ba// = /rali-ba/ ‘to rest’ + present cont.
    (iii) //rali + ba// = /rali-ba/ ‘to slow down’ + present cont.
    (iv) //radi + ba// = /radi-ba/ ‘to change’ + present cont.
    (v) //pelé + ba// = /pelé-ba/ ‘to believe’ + present cont.
(4) (i) //le + ba// = /lé-ba/ ‘to slice’ + present cont.
    (ii) //lé + ba// = /lé-bà/ ‘to think’ + present cont.
    (iii) //kelé + ba// = /kelé-ba/ ‘to heat’ + present cont.
    (iv) //zé + ba// = /ze-bà/ ‘to sell’ + present cont.
    (v) //zé + ba// = /zé-bà/ ‘to sleep’ + present cont.


Here, we observe again that although the stems in (ii) and (iii) have the same ‘High’ tone,
only stems with one of the ‘High’ tones, i.e. (iii), trigger an Extra High tone on the suffix,
similar to stems that end with an Extra High or a Mid tone. In contrast, stems that end with
the other ‘High’ tone, i.e. (ii), trigger a High tone on the suffix, similar to stems that end
with the Low tone. Following the same reasoning as for (1) and (2), the High tone in (iii) would
now have to be represented as either H(EH) or H(M), because only then would this tone
trigger the resulting Extra High tone on -ba.

We see that representing the fifth tone as H(M) works for all the tonal processes in (1),
(2), (3), and (4). Representing it as H(EH) will not work in (1)(iii) and (2)(iii), as that would
trigger an ungrammatical Low tone on -hie and -lie, respectively. Likewise, H(L) in (3)(iii)
and (4)(iii) would trigger an undesired High tone on -ba.
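
The reasoning above can be made concrete with a small Python sketch. It encodes one reading of the alternations described for (1) to (4): the suffix tone depends on the last tone of the stem, and a floating tone written in parentheses counts as that last tone. The table `TRIGGERED_TONE` and the function names are hypothetical illustrations of the argument, not a formalism proposed in the paper.

```python
# Suffix tone triggered by the stem-final tone, as read off the prose
# description of (1)-(4). "lie" also stands for -hie, which patterns alike.
TRIGGERED_TONE = {
    "lie": {"EH": "L", "H": "L", "M": "M", "L": "M"},
    "ba": {"EH": "EH", "H": "H", "M": "EH", "L": "H"},
}


def triggering_tone(stem_tone: str) -> str:
    """Return the tone that conditions the suffix: the floating tone if one is
    written in parentheses (e.g. 'H(M)' -> 'M'), otherwise the tone itself."""
    if stem_tone.endswith(")") and "(" in stem_tone:
        return stem_tone[stem_tone.index("(") + 1:-1]
    return stem_tone


def suffix_tone(stem_tone: str, suffix: str) -> str:
    """Look up the tone that surfaces on the toneless suffix."""
    return TRIGGERED_TONE[suffix][triggering_tone(stem_tone)]


# Compare the three candidate representations of the fifth tone.
candidates = {
    rep: (suffix_tone(rep, "lie"), suffix_tone(rep, "ba"))
    for rep in ("H(M)", "H(EH)", "H(L)")
}
```

Under these assumptions only H(M) yields both the attested Mid tone on -lie and the attested Extra High tone on -ba; H(EH) gives a Low tone on -lie and H(L) gives a High tone on -ba.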

To summarise, the examples in (5) show the morphophonemic alternations for: (i) High 
tone; (ii) the ‘fifth’ tone; and (iii) the Mid tone in the verb stem. 
(5) (i) rə.lí -liè ‘rest’ + imperative ‘take rest’
        H -L
    (ii) rə.lí -lie ‘slow’ + imperative ‘slow down’
        H(M) -M
    (iii) rə.dī -lié ‘change’ + imperative ‘make change’
        M -M


5. Conclusion 


Although we find four phonetically distinct tones as previous phonetic studies have shown, 
morphophonemic alternations show that there are more than just four tones, since the ‘High’ 
tone exhibits two types of alternations. Hence, we argue for a fifth phonological tone which is 
not phonetically distinct from what has been called the ‘High’ tone by Dutta et al. (2012). The 
proposed representation of the fifth tone is that it is a combination of two tones, a High tone 
with a floating Mid tone. The fifth tone is bi-tonal where the Mid tone is not realised 
phonetically but participates in morphophonemic alternations. Future work will look at 
acoustic studies of the tones in the context of different suffixes. It would also be interesting to 
re-examine the tone system of the variety spoken in Khonoma, to see if Blankenship et al.’s
(1993) acoustic study also missed this ‘fifth’ tone.
Abstract

Sora belongs to the South Munda subgroup of Austroasiatic languages spoken in central and eastern
India. In the mid 19th century, groups of Sora speakers migrated to the northeastern states of India,
particularly to the province of Assam. The study provides a description of the vowel inventory of Assam Sora,
supported by acoustic analysis. The results of the study show that the Sora spoken in Assam has six
vowels in its inventory: /i, e, a, u, ə, o/. It is also observed that Assam Sora words are minimally
disyllabic, and the last sections of the study present an analysis of vowel quality, vowel duration, f0 and
vowel intensity in the two syllables of disyllabic words in Assam Sora. The differences in vowel quality,
vowel duration, f0 and vowel intensity between the two syllables point towards the existence of iambic stress
in Assam Sora disyllables.
1. Introduction


Sora is a South Munda language of the Austroasiatic language family spoken by the tribe of
the same name in India. Stampe (1965) and Diffloth and Zide (1992) propose that Sora, which is
also known as Saora and Savara, is a member of the Koraput Munda subgroup in Orissa.
They also relate Sora to Gorum, Gutob, Gtaʔ, and Remo under the same Koraput Munda
subgroup. Zide (1976) further adds Juray to this list (see Figure 1).² The 2001 Census of India
indicates that Sora is spoken by 252,519 individuals in sixteen provinces of India (see
Appendix 1). A map of Sora speaking areas in Orissa and Assam is presented in Figure 2.


South Munda 


Kharia Juang Koraput Munda 
Kharia Juang Gutob Sora 
Remo Juray 
Gta? Gorum 


Figure 1: Genetic classification of Sora (Stampe, 1965) 


¹ The authors would like to thank the audience of North East India Linguistics Society 2014 for their
constructive suggestions and criticisms. The authors are indebted to the anonymous reviewers and the editors for 
their invaluable comments and suggestions. 

² However, Anderson (2001) rejects the Koraput Munda sub-grouping of Sora, claiming that Koraput is only an
areal classification and not a genetic classification. Thus, Anderson's classification places Sora directly under the
South Munda subgroup and relates Sora only to Gorum. Anderson and Harrison (2008) further suggest that Juray
could actually be an understudied Sora dialect.
Based on the Census reports of 1961, Kar (1979) shows that at the time 684 Sora speakers
were also living in Assam in North East India. This number appears to have decreased to 406
speakers, as reported in the 2001 Census of India. However, we observe that currently there 
are more Sora speakers living in Assam than mentioned in the official records. Our 
preliminary survey reveals that there are over 1,500 Sora speakers living in just two villages 
of Sonitpur district in Assam, namely, Singrijhan and Sessa. 

This study examines the variety of Sora that is spoken in Singrijhan tea estate of Assam. 
Kar (1979) shows that Sora speakers had migrated to Assam from Orissa in the 19th century as
indentured tea labourers. Hence, we use the term Assam Sora to differentiate the Sora variety 
spoken in Assam from the Sora of central India. Moreover, while central Indian Sora has been 
studied to some extent, Assam Sora is an undescribed variety. Therefore, our analysis of 
Assam Sora is based on primary fieldwork data, while our comparisons with central India 
Sora are based on the available literature (Stampe 1965, Ramamurti 1986, Donegan 1993, 
Donegan & Stampe 2002, Anderson & Harrison 2008, 2011 etc.). 


Figure 2: Sora speaking areas in Assam and Orissa 


Generally, the Munda languages spoken in Assam are among the lesser-studied languages 
of the region. Sarmah et al. (2012) reveal that Munda languages in Assam have significant 
lexical similarities to their corresponding languages in central India. Considering this, in this 
study we aim to examine the phonological features of Assam Sora, focusing on its vowel 
inventory. Hence, the purpose is to give an exhaustive description of the vowel system of 
Assam Sora which can be used in future for comparison with the vowel system of central 
Indian Sora. 

Previous analyses of central Indian Sora have proposed three different vowel inventories.
While Stampe (1965), Donegan (1993) and Donegan and Stampe (2002) suggest that central
Indian Sora has nine vowels /i, ɨ, u, o, ə, a, ɛ, e, ɔ/, Anderson and Harrison (2008) suggest that
there are eight vowels /i, ɨ, u, o, a, e, ɔ, ə/. Ramamurti (1986) presents an inventory of six
vowels /i, e, a, o, u, ə/ along with some allophonic and idiolectal variations of the six vowels,
differing in height, length and roundedness. He suggests that there are three vowel length
distinctions in central Indian Sora: long, short and normal. Additionally, while he finds that [1] 
and [o] are allophonic variations of /i/ and /u/, he suggests that [ü], described as a high, back, 
unrounded vowel, is an allophone of /u/. Anderson and Harrison (2008) suggest that while [ü] 
could be an archaic feature of the language, [1] needs further study. Regarding vowel length 
distinctions, they also suggest that vowel length is phonemically contrastive, but do not 
provide sufficient evidence of this. An unpublished description by Starosta (1964) had 
indicated that there is some overlap and free variation between the central Indian Sora vowel 
pairs [i] / [e], [u] / [o], [9] / [a], [+] / [o] and [e] / [H]. However, he too attributes these 
differences to dialectal variation. These descriptions indicate that there is disagreement 
regarding the number of vowels in central Indian Sora. On the other hand, the vowel 
inventory of Assam Sora is yet to be established. Hence, the primary objective of this study is
to provide a description of the vowel inventory of Assam Sora, supported by acoustic 
analysis. 

Prior to the current study, we conducted a pilot field survey to acquire the basic lexicon of 
Assam Sora using the Swadesh list. Vowel data for the present study is also generated from 
the same basic lexicon and from subsequent interviews with native speakers of Assam Sora. 
From both these sources, we could find at least six contrastive vowels in Assam Sora. Some 
of the words that showcase the contrastive vowels were collected during the study and are 
presented in Table 1, transcribed by a trained phonetician. 


Table 1: Near minimal set of words in Assam Sora³


Vowel | Sora | English | Initial | Meaning | Medial | Meaning | Final | Meaning
[a] | zara | snake | angi | battle axe | aray | sour | bagsa | good
[e] | zere | red | ensi | finger ring | arey | stone | sese | choose
[i] | zipi | tooth | ipsa | plough shaft | garin | alliance | tusi | push
[o] | zopo | fruit | org | kind of honey | irog-or | Indian Manga | kisso | dog
[ə] | aqnam | name | ap-na | to fan | uray | bamboo | basa | salt
[u] | anum | urine | unru | heating | urum-a | bring | asu | sick


From the pilot survey we observe that Assam Sora speakers produce six contrastive
vowels [i e a o u ə]. While the other five vowels occur in word-initial, medial and final positions, the
vowel [ə] occurs in word-medial and word-final but rarely in word-initial position (see
Table 1). Ramamurti (1986) and Anderson and Harrison (2008) also find that the vowel [ə] in
central Indian Sora never occurs in a stressed or an accented position. Considering the
observed vowel inventory for Assam Sora, in this study we (a) investigate the acoustic
characteristics of these vowels; and (b) compare the Assam Sora vowel inventory with the
vowel inventories proposed for central Indian Sora. Additionally, this study reveals that in
Assam Sora, monosyllabic words are rare and a word is preferably minimally disyllabic.⁴
Hence, we would also like to investigate (c) whether the vowel characteristics of Assam Sora differ
depending on their position within a word.


³ All the words in this table are collected in the current study and many of them did not have correspondences in
the previous studies on central Indian Sora that have been mentioned in this paper. 

⁴ We have a database of 1,150 words in Assam Sora. Of these words, 36 are monosyllabic, 524 are disyllabic and
590 are trisyllabic. 
2. Materials


Assam Sora is an undescribed variety of Sora; hence, we first gathered a wordlist from a pilot 
survey. For this study, the wordlist was enhanced after extensive consultation with native 
speakers of Assam Sora. The following subsections describe the materials and methods 
applied in the current study in detail. 


2.1. Participants 


Speech data were recorded at Singrijhan tea estate in Sonitpur district of Assam from seven
male and five female native Assam Sora speakers, between the ages of 20 and 40 (see
Appendix 1). The participants are multilingual: all of them speak Sadri, and some
also speak Assamese and Hindi. These languages are Indo-Aryan, although Sadri is
considered a creolized language spoken among the communities in the tea gardens of Assam
(Dey, 2014). While it is observed that there are more Sora speakers in the Sibsagar, Dibrugarh,
Golaghat and Udalguri districts of Assam, this study includes only the Sora speech variety of
Singrijhan tea estate in Sonitpur district. Moreover, although the participants claim that there
are differences between Sora villages, this is the first study of any Sora variety in Assam. The
participants also reported that the Assam Sora population in Singrijhan originated from
Ganjam district of Orissa.


2.2. Wordlist 


For the present study, 48 disyllabic Sora words forming vowel minimal pairs are described (see
Appendix 2). A set of Assam Sora monosyllabic words showing the full six-way vowel
contrast has not been found so far. Initial observations also suggest that Assam Sora words are
minimally disyllabic. Hence, the analysis here does not account for any monosyllables in
Assam Sora. Since we examined only disyllabic words, we have divided the vowel data into
three types of disyllabic words (see Table 2).


Table 2: Word types


Word Types                  | Examples
(a) (C)Vₐ.ʔVₐ               | siʔi, iʔi
(b) (C)V.CV and CVC.CV      | ola, boza, galzi
(c) (C)V.CVC                | orar, kakur


The data used in this study include disyllabic words, as in type (a), that have identical
vowels in the first and second syllables with an intervening glottal stop [ʔ]. Note that the
vowel [ə] does not occur in type (a) words. Disyllabic words in type (b) have non-identical
vowels in the first and second syllables and an intervening consonant. Finally, disyllabic
words in type (c) can have sonorant codas.


2.3. Data Recording


Recording was done in the tea gardens where the Sora speakers reside. The participants
described in §2.1 were asked to produce the Sora equivalents of the prompts provided in
Sadri. The speakers were requested to produce the words twice in isolation and twice in the
sentence frame in (1).

(1) neni ani giʔlaj
1st.sg write.pst see.pst
‘I saw written.’


In order to record words in sentence frames, target words were written in Assamese script
on white sheets of paper and shown to the participants, who responded by saying the
words in the sentence frame. The sentence frame in (1) was used so that all target words were
recorded in a controlled environment. Since Assam Sora is an unwritten Sora
variety, the participants prefer to write Sora in a familiar script; hence, the Assamese script
was used for creating the Sora stimuli in this study. One female speaker, who was not
proficient in reading, produced the words three times in isolation only. Additionally, Sora
words that have the final vowel [a] were considered only in isolation and not in the sentence
frame, since it was difficult to assign a distinct final boundary for such words. Moreover,
while recording words in sentence frames, two speakers changed the 1st person
singular pronoun from [nen] to [neni]. Hence, Sora words with the initial vowel [i] were
also recorded only in isolation, and not in sentence frames.

From the 48 disyllabic words with all the iterations across the twelve speakers (excluding
the discarded ones), a total of 4201 vowel tokens were analysed in the present study. Speech
data were captured with a Shure unidirectional head-worn microphone connected to a Tascam
linear PCM recorder via an XLR jack. The recordings were sampled at 44.1 kHz and saved as
24-bit WAV files. While recording, respondents sat comfortably in a chair placed in a school building
that had a good number of open doors and windows, which reduced echo. Considering
the breezy conditions at the time of recording, the recorder was set to a low-frequency cut at
40 Hz.


2.4. Acoustic Analysis 


This analysis of vowels in Assam Sora is primarily based on formant frequency
measurements. The first three formants (F1, F2, and F3) were extracted at the vowel midpoint,
and formant values were auto-generated using a script for Praat 5.3 (Boersma & Weenink
2015). Mel-transformed values were also auto-generated through the same Praat script and
normalized for speaker effects using the Lobanov normalization method in NORM (Kendall &
Thomas 2007).
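The two transformations described above can be illustrated with a minimal sketch. It assumes the Mel conversion is the one we understand Praat's built-in hertzToMel to use, 550 ln(1 + f/550), and that Lobanov normalization is applied per speaker and per formant as a z-score using the sample standard deviation. The speaker labels and Hz values are hypothetical and do not come from the recordings; the Praat script and the NORM settings actually used in the study may differ.

```python
import math
from statistics import mean, stdev


def hz_to_mel(hz):
    """Convert Hz to Mel, assuming the formula 550 * ln(1 + hz / 550)."""
    return 550.0 * math.log(1.0 + hz / 550.0)


def lobanov(values_by_speaker):
    """z-score each speaker's values against that speaker's own mean and SD."""
    normalized = {}
    for speaker, values in values_by_speaker.items():
        mu = mean(values)
        sd = stdev(values)  # sample SD (an assumption); needs >= 2 tokens
        normalized[speaker] = [(v - mu) / sd for v in values]
    return normalized


# Hypothetical F1 measurements in Hz for two illustrative speakers.
f1_hz = {"S1": [300.0, 450.0, 700.0], "S2": [350.0, 520.0, 810.0]}

# Step 1: Mel transformation; step 2: per-speaker Lobanov normalization.
f1_mel = {s: [hz_to_mel(v) for v in vals] for s, vals in f1_hz.items()}
f1_norm = lobanov(f1_mel)
```

After normalization, each speaker's values have mean 0 and standard deviation 1, which is what allows tokens from different speakers to be pooled in a single vowel plot.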

Since the list of words in our analysis consisted of disyllables, we analysed the vowel
qualities of the first and the second syllables of the disyllables separately. Apart from vowel
quality, we also investigated whether the two syllables in a disyllabic word in Assam Sora
differ in terms of vowel duration, fundamental frequency (f0) and intensity, to see if there is a
difference in prominence between the two syllables.

The recorded sentences with the target words (see Appendix 2 for the list) were manually
annotated, and special care was taken to mark the beginning and the end of each vowel. The
beginning and end of steady-state formants were taken as the vowel boundaries. Vowel
duration, f0 and vowel intensity values were calculated automatically using a Praat script and
exported to a spreadsheet. The values in the spreadsheet were later used for plots
and statistical analyses.


5 As far as we could analyze, the alternation between [nen] and [neni] does not have any grammatical
implications.
3. Results


Previous descriptions of central Indian Sora report three different vowel inventories for the
language. Ramamurti (1986) reports a 6-vowel inventory; Anderson and Harrison (2008)
report an 8-vowel inventory; and Stampe (1965), Donegan (1993) and Donegan and Stampe
(2002) report a 9-vowel inventory for central Indian Sora. Ramamurti (1986) also reports a
number of allophones for the 6-vowel inventory of central Indian Sora, as shown in Table 3.


Table 3: Central Indian Sora vowels and their allophones (Ramamurti, 1986)


Vowel | Allophones
/i/   | [11:11]
/e/   | [e e:]
/a/   | [a a:]
/o/   | [o o: 6]
/u/   | [u u: 0 o: ü]
/ə/   | [ə]


While our analysis of the Assam Sora vowel inventory resembles Ramamurti's vowel
inventory of central Indian Sora, we find that the two vowels /ɨ/ and /ʊ/ that appear in
Anderson and Harrison (2011) do not occur in Assam Sora. Words that have the vowel /ɨ/ in
Anderson and Harrison (2011) are produced with [i], [ə] or [e] in Assam Sora. Similarly, words that
are transcribed with /ʊ/ in Anderson and Harrison (2011) are produced with [u] or [ə] by Assam
Sora speakers (see Table 4). On the other hand, the nine-vowel inventory of Stampe (1965),
Donegan (1993) and Donegan and Stampe (2002) could not be verified, since their
descriptions are not supported by sufficient acoustic information.


Table 4: Central Indian Sora and Assam Sora vowel comparisons⁶


Vowel | Central Indian Sora | Assam Sora | Meaning
ɨ~i   | idza                | Idea       | never
      | adil                | adil       | dirty
ɨ~ə   | animnam             | anamnam    | name
      | dzumbirna           | zumbarna   | steal
ɨ~e   | anindun             | anenduy    | roam
      | pilisida            | pilesida   | uproot
ʊ~u   | koduple             | kuduple    | all
ʊ~ə   | buysa               | bansa      | fine


Hence, in order to substantiate our observations with acoustic evidence, we conducted an 
instrumental analysis of the vowels of Assam Sora. The following subsections describe in 
detail our findings and support our claim that Assam Sora has a six-vowel system. We present 
our findings from the analysis of disyllabic words of Assam Sora in the subsequent sections. 


6 We are thankful to Dr. K. David Harrison and Dr. Gregory D. S. Anderson for providing us with the CIS
database.




3.1. Formant Frequency 


The average F1, F2 and F3 frequencies of the six Assam Sora vowels, with their standard
deviations in parentheses, are shown in Table 5. The vowel plot in Figure 3 is drawn from the
average F1 and F2 of all vowel tokens as produced by the 12 Assam Sora participants.
Considering that average formant frequencies also include speaker-intrinsic features, the
Lobanov normalization method was used to normalize for speaker effects. Figure 4 shows the
acoustic vowel diagram of Assam Sora vowels with F1 and F2 normalized by the Lobanov
normalization method using NORM (Kendall & Thomas 2007).


Table 5: Average formant frequencies (Mel) with standard deviation


     | i       | e       | ə       | a       | o       | u
F1   | 273.44  | 358.38  | 333.71  | 468.59  | 403.34  | 327.25
(SD) | (36.70) | (47.82) | (46.13) | (70.09) | (49.25) | (45.15)
F2   | 923.18  | 868.35  | 786.13  | 743.42  | 614.34  | 596.61
(SD) | (54.89) | (47.14) | (70.42) | (61.72) | (67.14) | (94.98)
F3   | 1040.24 | 994.49  | 980.36  | 970.48  | 988.68  | 987.91
(SD) | (35.58) | (38.06) | (42.38) | (41.91) | (41.20) | (41.78)

[Vowel plot not reproduced: F2 (Mel) on the horizontal axis, from 950 to 550; vertical axis from 250 to 500]


Figure 3: Average F1 and F2 (non-normalized) of Assam Sora vowels


[Vowel plot not reproduced]


Figure 4: Lobanov normalized vowels of Assam Sora with one standard deviation ellipses


This analysis supports our initial observation that Assam Sora has a six-vowel inventory.
The vowel plots in Figure 3 and Figure 4 indicate a six-vowel system in Assam Sora with five
peripheral vowels /i, e, o, a, u/ and one non-peripheral vowel /ə/. Individual non-normalized
formant plots (see Appendix 5) for the 12 speakers of Assam Sora are consistent with the
plots in Figure 3 and Figure 4.

Hence, while the vowel inventory of Assam Sora presented here resembles the one
presented by Ramamurti (1986), it differs from the eight- and nine-vowel systems reported by
Anderson and Harrison (2008), Stampe (1965), Donegan (1993), and Donegan and Stampe
(2002). In order to verify the distinctiveness of the vowels on the basis of their formant
frequencies, a one-way ANOVA test was conducted for all possible vowel pairs in Assam
Sora. The results are summarized in a matrix in Table 6.


Table 6: Significance matrix for formant frequencies 


F2 
Vowels olaleļəlilo 
e * 
9 * * 
1 * * * 
o * * * 
u * * * * * * 


From the matrix it is observed that the six vowels in Assam Sora differ significantly with
respect to their F2 values. This indicates a distinct front, central and back vowel positioning
for all the vowels in Assam Sora. On the other hand, F1 values show that, except for [ə] and
[u], all the vowels have significantly different F1 values. This indicates that the
non-peripheral vowel [ə] in Assam Sora has a relative height equal to the back peripheral vowel
[u] (also seen in the vowel plots in Figure 3 and Figure 4). However, the other vowel pairs
show a clear distinction in terms of their heights.
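The pairwise comparison behind the significance matrix can be sketched as follows. This is a minimal illustration of the one-way ANOVA F statistic, not the authors' procedure or software; the F2 values are hypothetical, and obtaining a p-value additionally requires the F distribution (for instance via `scipy.stats.f_oneway`, which is not used here so that the sketch stays dependency-free).

```python
from statistics import mean


def one_way_anova(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)          # number of groups (here: vowels being compared)
    n = len(all_values)      # total number of tokens
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of tokens around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical F2 values (Mel) for tokens of two vowels.
f2_i = [920.0, 931.0, 915.0, 927.0]
f2_e = [870.0, 862.0, 875.0, 866.0]
f_stat, df1, df2 = one_way_anova(f2_i, f2_e)
```

Running the same function on every pair of vowels, once with F1 and once with F2 as the dependent variable, would fill in a matrix of the kind shown in Table 6.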
Since Assam Sora words are minimally disyllabic, we were able to compare the two syllables, and we noticed some differences in the
formant frequencies between them. This motivated us to look at the vowel
qualities in the two syllables separately. In this and the following subsections, we investigate
the vowel quality, duration, f0 and intensity separately in order to see (a) if these phonetic
properties are significantly different depending on the type of vowel and (b) if the differences
in the phonetic properties of the vowels in the two syllables can provide any clue regarding
the word-level stress of Assam Sora.

In order to see if there is a difference in the vowel space in the two syllables, we examined 
the vowel plots of the first and second syllables of disyllabic words. Figure 5 shows an F1-F2 
plot for all vowel tokens, in the first and the second syllable of disyllabic words, as produced 
by the twelve Assam Sora speakers. 


[Vowel plot not reproduced: Syllable 1 and Syllable 2 vowel spaces; F2 (Mel) on the horizontal axis, from 950 to 550; vertical axis from 250 to 500]


Figure 5: Average F1 and F2 (non-normalized) of first and second syllable in Assam Sora


The vowel plot in Figure 5 shows that the vowel space in the first syllable is narrower
than the vowel space in the second syllable. To examine the extent of the change of vowel
space, we calculated the Euclidean distance between vowels in the first and second syllable
from their formant frequencies. The relative Euclidean distance between the first and second
syllable is given in Table 7. We also subjected the normalized F1 and F2 values for each of
the vowels to a one-way ANOVA test, with normalized F1 and F2 as dependent variables and
syllable position (first or second) as factor. The ANOVA tests revealed that all the formant
values, except the F2 value for [i], are significantly different in the first and the second
syllables. The results of the ANOVA tests are provided in Appendix 4.


Table 7: Average Euclidean distance between vowel formants in first and second syllable


Vowels | Euclidean Distance
i      | 18.07
       | 28.74
       | 33.09
       | 42.48
       | 43.33
       | 56.63
The plot in Figure 5 and the data showing a significant formant value difference between
the first and second syllables both confirm that the vowels in the first syllables are more
centralized. We therefore believe that vowels in the second syllable are more representative of
the canonical vowel space in Assam Sora.
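The Euclidean distance reported for each vowel can be illustrated with a short sketch. It assumes that the distance is taken in the F1-F2 plane between a vowel's mean position in the first syllable and its mean position in the second syllable; whether F3 was also included is not stated in the text. The formant values below are hypothetical.

```python
import math


def euclidean_distance(first, second):
    """Distance between two (F1, F2) points, in the same unit as the input."""
    return math.hypot(first[0] - second[0], first[1] - second[1])


# Hypothetical mean (F1, F2) values in Mel for two vowels in each syllable.
syllable1 = {"i": (285.0, 915.0), "a": (450.0, 735.0)}
syllable2 = {"i": (270.0, 925.0), "a": (480.0, 750.0)}

# One distance per vowel: how far the vowel moves between the two syllables.
distances = {v: euclidean_distance(syllable1[v], syllable2[v]) for v in syllable1}
```

A larger distance means that the vowel's quality differs more between the two syllable positions.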


3.2. Acoustic properties of Assam Sora disyllables 


As mentioned in the sections above, we see a significant difference between the vowel quality
of Assam Sora vowels in the first and the second syllable. This prompted us to investigate whether
such differences can also be seen in other acoustic properties, which would point to a difference in word
stress between the syllables. Hence, in the following subsections we investigate differences between
the two syllables in terms of temporal and spectral features.


3.2.1. Vowel Duration 


While some Austroasiatic languages, like Bunong and Kammu, are reported to have a
contrastive vowel length distinction (Jenny et al. 2014), North Munda languages like Santali
(Ghosh 2008), Mundari (Osada 2008) and Keraʔ Mundari (Kobayashi & Murmu 2008) are
not known to have phonological vowel length. Some South Munda languages like Gutob
(Griffiths 2008) and central Indian Sora (Anderson & Harrison 2008) are reported to have
phonemic vowel length. However, as mentioned in §1, the phonemic vowel length difference in
central Indian Sora needs further investigation.

In Assam Sora, vowel duration is not found to be distinctive. Hence, in this section we
examine vowel duration to see if there is any difference between the duration of the vowels in
the first and second syllables in the disyllabic words of Assam Sora. Figure 6 shows the
average vowel duration in the first and second syllable for the three types of disyllabic words
examined in this work (see Table 2). The average durations show that, in
Sora disyllabic words, vowels in the second syllables are longer than in the first syllables.


[Bar chart not reproduced: vowel duration (ms) in Syllable 1 and Syllable 2 for (C)V.CV-Identical, (C)V.CV-NonIdentical and V.CVC words]


Figure 6: Average vowel duration in first and second syllable with standard deviation as error bars
Figure 6 shows that for each of the three word types, the vowels in the second syllable
are longer than the vowels in the first syllable. In order to confirm the durational differences,
we conducted a one-way ANOVA on the vowels in (C)V.CV and V.CVC word types
separately. For both word types we considered duration as dependent variable and syllable
location (first or second) as factor. For both word types, we found that the duration of the
vowels in the second syllable is significantly longer than the duration in the first syllable (for
(C)V.CV, F(1, 2838) = 2745.42, p < 0.001; for V.CVC, F(1, 521) = 8.39, p < 0.05). It is
noteworthy that in V.CVC words, even though the vowel in the second syllable is in a closed
syllable, the vowel in the first syllable is still significantly shorter than the vowel in the
second syllable.


3.2.2. Fundamental Frequency 


In order to examine if there is a difference in f0 between the first and second syllables, we
extracted the average f0 values of the vowels in each of the syllables. In order to
minimize gender influences on f0, we normalized the values using the z-score normalization
method proposed by Lobanov (1971). Figure 7 presents the average z-score values for f0 in the
two syllables of disyllabic words. The normalized values show that f0 is
lower in the initial syllables. A one-way ANOVA test conducted with normalized mean f0 as
dependent variable and syllable position as factor showed a significant difference between the
f0 of the first and the second syllable [F(1, 3291) = 388.88, p < 0.001]. We also compared the
maximum f0 in the two syllables of disyllabic words. Figure 8 presents the maximum f0, by
speaker, for the first and second syllable. Here we can see that for all speakers, the maximum
f0 is higher in the second syllable than in the first. A one-way ANOVA test conducted on
normalized maximum f0 values indicated that in terms of maximum f0, the first and the second
syllables in a disyllabic word are significantly different [F(1, 3291) = 616.24, p < 0.001], with
the maximum f0 in the second syllable being higher for every speaker.


[Bar chart not reproduced: z-score f0 for Syllable 1 and Syllable 2]


Figure 7: Speaker-normalized average f0 in first and second syllables


[Bar chart not reproduced: maximum f0 by speaker for Syllable 1 and Syllable 2]


Figure 8: Maximum f0 in first and second syllables for all speakers


The analyses in this section demonstrate that in terms of average f0 and maximum
f0 across the vowel, the two syllables of disyllabic words in Assam Sora are significantly
different. The first syllable has significantly lower average f0 and maximum f0 than the
second syllable.


3.2.3. Vowel Intensity 


Finally, we analyzed the intensity of vowels across the two syllables of a disyllabic word in
Assam Sora. Figure 9 shows vowel intensity of the first and second syllable for the three
types of Sora disyllabic words. The figure shows that vowel intensity differs only minimally
between the first and second syllable for each of the three word types. However, a one-way
ANOVA test conducted with average intensity as dependent variable and syllable position as
factor showed a statistically significant difference in average intensity between the first and the
second syllables [F(1, 3361) = 51.06, p < 0.001]. A follow-up univariate ANOVA test showed
that there is a significant effect of speaker on the average intensity across the two syllables.
The test with Speaker × Syllable type × Position in word (first or second syllable) as factors
and average intensity as dependent variable showed a significant interaction [F(22, 3291) = 4.52, p <
0.001]. The results indicate that average intensity patterns for the two
syllables differ across speakers. Therefore, no consistent pattern of average intensity difference
between the two syllables of a disyllabic word is found across all speakers.
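The degrees of freedom reported for the two intensity tests can be cross-checked with simple arithmetic. The sketch below assumes a fully crossed design with every cell filled, 12 speakers, 3 levels of the "Syllable type" factor (taken here to correspond to the three word types in Table 2) and 2 positions; these readings are our assumptions rather than details stated in the text. The token count is inferred from the one-way test, where the within-groups degrees of freedom equal the number of tokens minus the two groups.

```python
def interaction_df(*levels):
    """Degrees of freedom of the highest-order interaction: product of (levels - 1)."""
    df = 1
    for n in levels:
        df *= n - 1
    return df


def residual_df(n_tokens, *levels):
    """Residual degrees of freedom: tokens minus the number of design cells."""
    cells = 1
    for n in levels:
        cells *= n
    return n_tokens - cells


# F(1, 3361) in the one-way test implies 3361 + 2 tokens (assumed inference).
n_tokens = 3361 + 2

print(interaction_df(12, 3, 2))          # 22
print(residual_df(n_tokens, 12, 3, 2))   # 3291
```

Under these assumptions both values agree with the reported F(22, 3291), which suggests that the same set of tokens entered the one-way and the three-factor tests.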

We also compared the maximum vowel intensity between the first and second syllables
for Sora disyllabic words and found that maximum vowel intensity in the two syllables does
not differ significantly (see Figure 9). A one-way ANOVA test conducted with maximum
intensity as dependent variable and syllable position as factor failed to show any statistically
significant difference between the first and the second syllables in terms of maximum
intensity [F(1, 3361) = 0.38, p > 0.05].

[Bar chart not reproduced: average intensity and average maximum intensity for Syllable 1 and Syllable 2; vertical axis from 65 to 90]


Figure 9: Average vowel intensity and maximum intensity in first and second syllables


The analysis in this section shows that in terms of average intensity, the two syllables in
Assam Sora disyllables remain distinct; however, the distinction is not systematic across all
speakers. In terms of maximum intensity, there is no difference between the two syllables.


4. Discussion 


This study reports the results of an acoustic study conducted on Assam Sora vowels and 
proposes a vowel inventory for the variety. Secondly, it also compares the vowel inventory 
with the vowel inventories proposed for Sora as spoken in central India. Additionally, this 
study reports acoustic differences between vowels in the first and second syllables of Assam 
Sora disyllabic words. 

Contrary to the reports of central Indian Sora having a nine-vowel inventory, the vowel
data analysed for the current study show that Assam Sora has a six-vowel system. The
limited data that we examined in this study do not show evidence for the three vowels /ɛ/,
/ɨ/ and /ɔ/ attested in central Indian Sora (Stampe 1965). In a recent study by the authors of
this paper, 1,825 words from Assam Sora were subjected to acoustic analysis and the absence
of the three vowels was confirmed (Horo & Sarmah 2015).

As far as the motivations for the organization of vowel systems are concerned, Lindblom
(1986) mentions that the vowel system of a language often closely resembles the systems of
other languages in the same subfamily. He also mentions that the vowel system of a language
is influenced by the vowel systems of languages spoken in its vicinity. Austroasiatic
languages usually have a large vowel inventory (Jenny et al. 2014). However, the vowel
inventories of the languages in the Munda subgroup are generally smaller. Jenny et al. (2014)
note that Mundari, Keraʔ, Korku, Kharia, Sora and Gutob all have a five-vowel inventory.
Additionally, Anderson and Rau (2008) report that Gorum also has five vowels in its vowel
inventory. Given that Sora is a language of the Munda subfamily, our conclusion that Assam
Sora has a six-vowel inventory seems more tenable. Apart from that, it is also worth
mentioning that the Bodo-Garo languages spoken in Assam also have six-vowel systems
similar to the Assam Sora system (Burling 2013, Sarmah et al. 2015, Sarmah & Redmon
2013). However, as there is very little contact between these linguistic groups, we cannot be sure
about any influence of the Bodo-Garo languages on Assam Sora vowels.

Typologically speaking, languages with five- or six-vowel systems, like Spanish, Greek
and Maori, are more common among the world's languages (Liljencrants & Lindblom 1972). On the
other hand, languages with nine or more vowels, like Turkish, Thai and English, are
considered marked. Hence, if central Indian Sora actually has a nine-vowel system as reported
in Stampe (1965) and Donegan (1993), it would seem that the Assam Sora vowel system is
moving towards being a typologically unmarked vowel system. At the same time, it should be
noted that Kharia (Peterson 2008) and Gorum (Anderson & Rau 2008), the closest sister
languages of Sora, each have a five-vowel inventory. While this study confirms that there are
only six vowels in the Assam Sora vowel inventory, any claim of vowel inventory reduction in
Assam Sora needs thorough synchronic and diachronic investigation into the vowel inventory of
central Indian Sora.

Finally, we have found that the vowel space in initial syllables in Assam Sora is reduced.
We also noticed that the average f0 and maximum f0 of the second syllables are higher. In
addition, it was confirmed that the vowel duration in the second syllable is greater than that in
the first syllable. Considering this, we see a possibility that the second syllable is stressed in a
disyllabic word in Assam Sora, characterized by higher pitch, longer duration and a change
in vowel quality, all of which are considered to be among the acoustic correlates of stress by
Ashby and Maidment (2005). In the case of Assam Sora, we notice that the second syllable
displays higher f0 and longer vowel duration, which suggest greater prominence. However, in
order to confirm such a suggestion definitively, perception tests based on these acoustic
findings are necessary.
Appendix 1


Sora population in India (Source: Census of India, 2001)


States            | Population | States         | Population
Orissa            | 172,288    | Maharashtra    | 112
Andhra Pradesh    | 74,788     | Meghalaya      | Tu
Tripura           | 2,155      | Karnataka      | 28
West Bengal       | 1,696      | Madhya Pradesh | 13
Jharkhand         | 639        | Bihar          | 4
Assam             | 406        | Rajasthan      | 2
Arunachal Pradesh | 174        | Uttar Pradesh  | 1
Chhattisgarh      | 135        | Haryana        | 1


Appendix 2


Assam Sora wordlist


No. | Sora  | English            | No. | Sora  | English
1   | iri   | louse              | 25  | kani  | this
2   | uru   | hair               | 26  | kuni  | that
3   | daʔa  | water              | 27  | kena  | tiger
4   | doroy | body               | 28  | kina  | sing
5   | luru  | ear                | 29  | ajo   | fish
6   | moro  | eye                | 30  | saju  | cold
7   | muru  | nose               | 31  | tagi  | hot
8   | rara  | elephant           | 32  | togi  | fire
9   | sie   | hand               | 33  | zali  | long
10  | soʔo  | rotten foul smell  | 34  | zile  | over-boiled rice
11  | toʔo  | mouth              | 35  | zelu  | meat
12  | zaʔa  | snake              | 36  | dla   | tongue
13  | zeʔe  | red                | 37  | alug  | in
14  | ziʔi  | tooth              | 38  | ala   | tail
15  | zoʔo  | fruit              | 39  | "Form | blind
16  | boza  | sew                | 40  | kura  | earthen oven
17  | buza  | suck finger        | 41  | drer  | ice
18  | baza  | spit out           | 42  | orar  | earth
19  | basa  | salt               | 43  | anam  | name
20  | galzi | ten                | 44  | anum  | urine
21  | gulzi | seven              | 45  | kakiņ | elder sister
22  | gada  | cut like stab      | 46  | kakuy | elder brother
23  | goda  | rub/clean          | 47  | manam | blood
24  | guda  | scratch            | 48  | punam | warm


Appendix 3


Background information on Sora speakers in the study


Name | Age | Gender | Education | Languages Known
BS1  | 25  | M      | Graduate  | Sora, Sadri, Assamese and Hindi
BS2  | 40  | M      | None      | Sora and Sadri
BS3  | 40  | F      | None      | Sora, Sadri and Assamese
CS   | 38  | M      | 10        | Sora, Sadri, Assamese and Hindi
JS1  | 25  | M      | 10+2      | Sora, Sadri, Assamese and Hindi
JS2  | 23  | M      | 10+2      | Sora, Sadri, Assamese and Hindi
LS1  | 25  | F      | 10+2      | Sora, Sadri and Assamese
LS2  | 35  | F      | 10        | Sora, Sadri and Assamese
NS   | 21  | F      | 10        | Sora and Sadri
PS   | 38  | M      | None      | Sora and Sadri
SS   | 25  | M      | 10+2      | Sora, Sadri, Assamese and Hindi
TS   | 24  | F      | 10+2      | Sora, Sadri and Assamese


Appendix 4


Results of one-way ANOVA test conducted on normalized F1 and F2 values in the two syllables


Vowels | F1                              | F2
i      | F(1, 686) = 52.35, p < 0.001    | F(1, 686) = 0.72, p > 0.05
ə      | F(1, 242) = 47.41, p < 0.001    | F(1, 242) = 4.64, p < 0.05
u      | F(1, 750) = 31.76, p < 0.001    | F(1, 750) = 21.84, p < 0.001
a      | F(1, 1556) = 153.62, p < 0.001  | F(1, 1556) = 123.05, p < 0.001
o      | F(1, 711) = 83.63, p < 0.001    | F(1, 711) = 78.27, p < 0.001
e      | F(1, 228) = 238.45, p < 0.001   | F(1, 228) = 17.93, p < 0.001


Appendix 5


Individual vowel plots for the twelve Assam Sora speakers


[Vowel plots not reproduced; recoverable panel titles: Vowels (BS1), Vowels (JS1); horizontal axis: F2 (Hz)]


6. Differential marking of arguments and the grammaticalisation
of subjectivity values in War¹


Anne Daladier 
LACITO CNRS France 


Abstract War, like Pnar and Lyngam, but unlike Khasi, has developed a conservative isolating AA morphology of
particles and grammaticalized elements which interact among themselves according to a hierarchy of an
Assertive Dependency System (ADS). They also interact with lexical elements, this interaction being the
core of War grammar. ADS has no cline toward a conventional verbal system, especially no kind of
argument(s)-verb agreement. Instead, War has an optional differential marking highlighting core
arguments for agentive and beneficiary roles with focused values of agentive unexpectation for di and
beneficial empathy for haʔ referring to the subjectivity of interlocutors or animate core arguments. This
differential marking is the lowest sub-system of ADS. Usually core arguments are unmarked even in
ditransitive constructions. The differential marking adds some kind of secondary assertive value to a higher
assertive particle (in initial position of the utterance): with haʔ some request of empathy with the
interlocutor or an animate core argument, with di some request to pay attention.


This differential marking is not a DAM and DOM, as both haʔ and di may apply to any core argument
depending on constructions and especially on the lexical choice of their co-argument(s).


This differential marking bears subjective values referring to interlocutors, like many other levels of the
ADS of War. The grammar of War might be said to be pragmatic or syntactically under-specified at first glance,
but it requires another look, as the interaction between lexical and ADS elements is highly constrained,
providing interesting graded grammaticalized values.


The ADS of War lets us perceive many aspects of the genesis of verbal systems in AA.


Citation Daladier, Anne. 2015. Differential marking of arguments and the grammaticalisation of subjectivity values in War. 
North East Indian Linguistics 7, 91-110. Canberra, Australian National University: Asia-Pacific Linguistics Open 
Access. 


Volume Editors Linda Konnerth, Stephen Morey, Priyankoo Sarmah, Amos Teo 
Copyright © 2015, the author(s), released under Creative Commons Attribution license
URL http://hdl.handle.net/1885/95392 


1. Introduction 


War, a Mon-Khmer language spoken in Meghalaya, still has many productive poly-functional
isolating features, with a few affixes, most of them still productive, and no kind of verbal
basis, especially no kind of verb-argument(s) agreement. Instead it has a differential marking
of arguments with two particles adding values of unexpectation and empathy respectively to
any core argument depending on constructions.

As opposed to War, Pnar and Lyngam, Khasi has developed a subject-verb agreement 
with a clitic, referring to the subject, pre-posed to the verb and suffixation of reduced particles 
to this clitic expressing aspectual values. Austroasiatic languages (AA) have different kinds of 
argument agreement systems or no kind of agreement at all, either because this feature 
together with its morphology has disappeared, for example in some Bahnaric languages like 
Stieng, see Bon (2014), or because such a system is still not occurring, which is the case in 
core varieties of War, Lyngam and Pnar. Usually Munda languages have verbal bases with 
clitics referring to one or to several arguments according to a referential hierarchy, see the 


1 I am much indebted to all my War friends and consultants for teaching me War in the context of their oral
literature and in ordinary conversations in village life, most especially to Woh Thakur Pohtam Tean, Woh
Monti Pohtam Cherniah, Babu Lakhmie Pohtam Sohsley in Kudeng War, Muh Marlyda Pohleng in Amwi War,
Woh Khundep and Woh Tyiu in Nongbareh War and Woh Khylei in Nongtalang War. Most of the examples are
taken from Kudeng War in an unpublished corpus of tales and rituals collected in the main dialects of War.
authors of the different chapters in Anderson ed. (2008). Most Mon-Khmer languages (MK)
have verbal bases with clitics expressing subject-verb agreement inside derivational
morphologies. In O. Khmer, grounding features involving both some kind of aspect and some
kind of subjectivity features referring to the speaker may be expressed as prefixes, as shown
in detail and with much subtlety by Jenner and Pou (1981).

An overview of War among the MK languages of Meghalaya, which I define as
Pnaric-War-Lyngam (PWL) rather than Khasian, is proposed in Daladier (2011) (together with broad
typological features of War, Pnar, Khasi and Lyngam). The recent separate migrations of the
Pnars, Khasis, Wars and Lyngams in Meghalaya are described by Shadap-Sen (1981). The Pnar
kingdom took refuge in East Meghalaya, with its capital in Sutunga, in the 15th century A.D.
when the Ahom kingdom settled in Assam and took Nowgong, the former capital of the
Pnars. From different places in Bangladesh, small groups of War and Lyngam
agriculturists are still taking refuge in Meghalaya, joining their clan mates. Very recently
Khasi has become the main lingua franca in the context of the post-colonial Khasi
constitution. Due to its political dominance, especially schooling in Khasi, Khasi has a rapid and
pervasive influence on Pnar, War and Lyngam, resulting in many mixed varieties. Detailed
linguistic maps can be found on the web site of the LACITO, CNRS.²

In War, all core arguments are basically inserted directly, as shown in §2. They may also
optionally be marked with two particles. di is used for unexpected agentive roles, haʔ mainly
for a chosen animate beneficiary, for a favourite thing, or for a victim of an unwilling
detrimental action. haʔ adds some empathy value toward the interlocutor or toward an
animate argument, according to War conceptions of empathy, disregard and politeness, and
depends on the lexical choice of the verb. It has also faded to some superlative value for an
inanimate second argument. di marks an unexpected agency for any core argument taking an
agentive value in its construction.

As we shall see, the conventional syntactic notions of subject, object and dative, or any
case notion, are irrelevant for this differential marking, as it may apply to any core argument,
depending on the constructions in which it can get a focused agentive or beneficial value. This
is detailed in §3, §4 and §5.

This kind of marking is lexically but also grammatically constrained (restricted to core
arguments) and should not be confused with discursive focus operations, which can apply to
any argument, adjunct or adverbial element and may even apply further to those marked
arguments, as will be shown in §3 and §4.

² http://lacito.vjf.cnrs.fr/

In §2, I will take up questions introduced by LaPolla (1993) and taken up by Bisang
(2008) on relevant grammatical categories in what may be called topic-prominent languages.
LaPolla shows why the notions of “subject” and “object” are not relevant in modern Chinese.
The usual syntactic notions of “subject” and “object”, with their agent and patient semantic
distinction, are not the most relevant in War either, given its differential marking of arguments,
which may apply to both arguments. However, lexical elements having argument(s) have an
argument structure or valency which depends on their lexical selection restrictions and on
further grammaticalized valency-changing elements in the construction. For example, (a)
children, she likes and (b) she likes children have the same ordered argument structure,
likes (she, children), as opposed to (c) children like her, structured as like (children, her). The
verb like becomes intransitive in construction (d): Blonds are much liked here. I will
not take up the hypothesis concerning languages having some kind of under-developed syntax
and over-developed pragmatics, as it presupposes a universal conception of syntax based on
part-of-speech morphologies, syntagmatic tree structures and a separation between the
categories relevant for syntax and for pragmatics. Instead, I use syntactic categories without any
semantic reference: first, second and third argument for predicative verbals and nominals, and
letters for grammatical functions in the ADS, which may also be predicative (see §2). I aim at
inducing how they construct a structured grammar combining lexical and grammatical
meanings. War morphology does not induce part-of-speech categories, but its grammatical
morphology (particles, affixes, grammaticalizations) lets us enter another kind of structured
grammar. Grammatical elements design layers of what I term an “assertive dependency
system” (ADS), where the optional differential marking of arguments is the last layer. This
ADS is sketched in §7.
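
The ordered argument structures of (a) to (d) can be made concrete with a small sketch. The Python fragment below is only an illustration and not part of the analysis: the class name `Construction`, the helper `same_structure` and the encoding of arguments as plain strings are all hypothetical choices made for brevity. It shows that word order may vary while the ordered argument structure stays the same, and that a construction may reduce valency.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Construction:
    """A surface string paired with its ordered argument structure."""
    surface: str            # word order as uttered
    operator: str           # lexical operator (predicate)
    args: Tuple[str, ...]   # ordered arguments: first, second, ...

    @property
    def valency(self) -> int:
        # Valency is read off the number of arguments in this construction.
        return len(self.args)


def same_structure(x: Construction, y: Construction) -> bool:
    """True when two constructions share operator and ordered arguments."""
    return x.operator == y.operator and x.args == y.args


a = Construction("children, she likes", "like", ("she", "children"))
b = Construction("she likes children", "like", ("she", "children"))
c = Construction("children like her", "like", ("children", "her"))
d = Construction("Blonds are much liked here", "like", ("blonds",))

print(same_structure(a, b))  # True: word order differs, structure does not
print(same_structure(a, c))  # False: the arguments are ordered differently
print(d.valency)             # 1: intransitive in construction (d)
```

Keeping the argument tuple separate from the surface string mirrors the point made above: the categories "first" and "second argument" carry no semantic role and no positional information of their own.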

I will relate the optional marking of arguments, as a way to highlight arguments, to the fact
that War has no conventional morpho-syntactic foregrounding of arguments, that is, no
argument-verb agreement and no voice, but instead two kinds of tightly constrained
mechanisms foregrounding arguments. One is this differential marking of arguments; the
other is the use of a very rich set of transitiving and intransitiving auxiliaries and of two
causative prefixes which change valency, analyzed in Daladier (2012).

I use “salience” (SAL in glosses) as an abbreviation for “lexically constrained optional
marking of (core) arguments”. I use “focus” for discursively foregrounded elements, that is,
forms that change word order and are lexically and grammatically unconstrained.

I show in §6 that ha? and di are poly-functional particles, and I will argue that this
poly-functionality designs the cline of several grammaticalisations, from AA deictics to grounding
particles in War.

Before concluding, I will argue in §8 that the salient marking of core arguments for
agentive and beneficiary values is a conservative AA feature. I analyze as its vestiges
markings with agentive and beneficiary highlighting values, affixed on arguments in
addition to the clitics referring to them in the verbal basis, in the quite different verbal
systems of Semelai (Aslian) and Kharia (Munda).


2. Overview of the War grammatical system and terminological clarifications


Insertion of clitics or inflections referring to argument(s) in a verbal basis may be obligatory or
partially optional, depending on the language. In English, verb agreement with the subject is
obligatory and still corresponds to some foregrounding. In Old English, as in Old French, the
SVO word order was strongly focal for the subject. The passive foregrounds the argument used
as object in the active voice and displays morpho-syntactic features of valency change and
‘subject’ agreement, as in:


(1) John is leaving Mary. 
(2) Mary has been left.


Passive also involves a grammaticalized value of affectedness with selection restrictions
on the lexical verb, as seen in the different status of John left Mary / America and Mary /
??America was left by John.

In many languages, finite tense and other TAM inflections or affixes have a declarative
function, as opposed to infinitive or participial verbal forms. Finite / non-finite verbal
morphology is not a relevant opposition in War because tense is not expressed verbally or
with declarative markers. Some kind of temporal notion is finely encoded in temporal deictics
(e.g. seven elements differentiate the subjective meanings of ‘now’) and in some temporal
conjunctive uses of grounding aspectual particles. The grounding particles la? ‘DISTANT
POTENTIAL’ and day ‘PROGRESSIVE’ (example (3)), when suffixed with -nja, provide a kind of
future ‘when’, la?nja, and a kind of past ‘when’, dannja. These two ‘when’ can be conjunctive
or interrogative. -nja is also used to form interrogative and indefinite pronouns when suffixed
to personal pronouns and distal deictics.

There is no marking of a realis / irrealis opposition in War because there is no “realis” as a
declarative means of assertion. Conditional markers or negative wishes do not induce any
kind of irrealis marking in their dependent clause.

An utterance is usually marked initially with a grounding particle from the subjective
viewpoint of the speaker (or his subject). Unmarked utterances are of three kinds: non-polar
questions, imperative utterances and utterances which state factual information headed by
deictic or possessive relationships, as shown below.

In English, a simple noun like child cannot be used directly as a predicate, but it can be 
associated with a copula bearing TAM and subject agreement grounding features in a stative 
construction like: He is a child. In this example, is a child constructs intransitively with the 
argument he under grounding features agglutinated to the copula. The copula is not a lexical 
predicate; it is only the bearer of grounding features; those grounding features induce the 
interpretation of child as a state requiring an argument. In War, where the distinction noun/ 
verb is not a morphological one, grounding particles may construct directly with simple nouns 
inducing intransitive valency with a stative interpretation of these nouns combined with their 
own values as in (3), (4), (5): 


(3) dag hambo ?u
PROG child 3MS
‘He [is]³ still [a] child.’


(4) a Yerkjag ?u.
DCL elder 3MS
‘He [is an] elder.’


(5) da Yerkjayn ?u
PRF elder 3MS
‘He [has] already [reached the stage of being an] elder.’


Many lexical elements may be used nominally or verbally, or rather assertively or not, like
bua ‘eat’, a transitive constructed verbally with the weak commitment declarative marker a
‘DCL’ in (6a). Utterances headed by a express some commitment of the speaker with an
actualized value that has to be translated either as a past or as a present in English, because
there is no other way to actualize an utterance in English. In (6b), bua is constructed first as a
verb and then as a simple noun, ?i bua ‘food, sweet meat’, with ?i functioning like a
mass term article when pre-posed to a nominal use and as a third person plural
pronoun when post-posed to a verbal use:


(6a) a bua rei ?i
DCL eat rice 1P
‘We eat / have eaten our meal.’


(6b) de bua ?i ?i bua ka.
PRF eat 1P MASS sweet 3FS
‘We have eaten her sweets.’


³ I use brackets for grammatical information which has to be added in the English translation of the examples in
War and / or for alternatives when the exact meaning cannot be rendered in English.

A discursive focus in War, as in many other languages, consists in permuting any adjunct,
adverbial, non-finite verb or argument before the initial position of the utterance, with a
co-referential pronoun in the usual position of the permuted noun, as ?u ‘3MS’ in (7a), or with a
negative grammaticalized verb of knowledge, a kind of copula, in (7b). These focalisations are
not lexically or grammatically constrained, hence the term “discursive”:


(7a) erau a bua tzi ?u
3MSsr DCL eat rice 3MS
‘Him, he eats / has eaten (his) food.’


(7b) toP=ta fa Duki do Jip ?u
COP-not DIRs Duki PRF die 3MS
‘It is not in Duki that he died.’


Marked arguments can also be further discursively focused for more emphasis, see (13). 

In War, one asserts a clause with different kinds of particles expressing illocutionary 
forces (e.g. declarative) with some commitment of the speaker, positive or negative aspectual 
values, or subjective values, like agreement or empathy toward the interlocutor and different 
kinds of requesting values: injunctive, hortative, precative and interrogative or mirative. 
Here, the term ‘assertive’ includes all these “forces”. Usually utterances have an assertive 
marker in initial position or a combination of such markers. Declarative utterances without 
initial assertive particle may be headed by a deictic relationship as in (9) or by possessive 
relationships as in (8) or (10). Declarative utterances without assertive particles express 
factual information, information which does not rely on the subjectivity of the speaker. Rather 
than a lexical predication they are constructed on possessive predications as in (8), (10) or on 
deictic predications as in (9): 


(8) ?u hun ko Mersi ?u Lumlang.
3MS child 3FS Mercy 3MS Lumlang
‘Lumlang [is] the son of Mercy.’


(9) ?u=na ?u Lumlang.
3MS=PROX 3MS Lumlang
‘This one [is] Lumlang.’


(10) tzat Sumer ?u
specie Sumer 3MS
‘He [belongs to] the Sumer clan.’


While War has no verbal bases, it has deictic bases with a rich set of space and directional
deictics combined with referential elements. For example, with ka ‘she, the (feminine)’: ka=na
‘this one feminine near’; ka=ta ‘this one feminine far in view’; ka=tun ‘this one feminine
very far, still in view’. These deictic bases may be used assertively, as in (9).

When pre-posed to a lexical element used as a noun, third person pronouns function as
a kind of gender-number classifier which may be interpreted as an indefinite or definite article,
depending on the construction. For example, they are used with proper names of persons, as in
(8) and (9).

In a way which I relate to the absence of a finite / non-finite grounding opposition, War
marks clause dependency usually without conjunctions. Conjunctive words are
mostly borrowed from Pnar. Clause dependency is usually marked by correlating grounding
particles in clauses. There are no complementizers and no relative pronouns in War (as
opposed to Pnar).

Part-of-speech (syntagmatic) categories are unfit for War because they do not account for
other, more relevant features which are morphologically marked, together with their
grammatical information and their own oppositions. For example, as shown in the
data of this section, there is some apparently relevant nominal / verbal opposition, which
in particular enables a distinction between the uses of third person pronouns either as personal
pronouns with verbal uses or as gender/number classifiers and possessive pronouns with
nominal uses. Verbal uses of lexemes, never nominal uses, construct with declarative particles,
but as shown in (8), (9) and (10), there are also non-verbal utterances and, further, there are
verbal utterances without grounding particles, with implicit imperative or question forces
marked by intonation. It is easier and much more precise to define utterances in terms of ADS
and lexical elements. Lexical elements are classified into predicates (or rather operators, a term
which is less ambiguous than the notion of predicate, with its ordinary subject-predicate
structure including the assertive morphology) and elementary elements.

Depending on constructions, the valency of a lexical operator may change, for example in 
construction with a transitiving or di-transitiving causative prefix or with transitiving or 
intransitiving auxiliaries, see Daladier (2012). However, before insertion of a lexical operator 
in a given context, the knowledge of its lexical valency is necessary to account for different 
kinds of grammatical ellipsis. Grammatical reconstructions have to be defined. For example, 
the reconstruction of the first argument in (6c) and (6d) relies on the assertive force of the 
utterance: in (6c) the combination of perfective markers do dep 'already done' belongs to 
declarative grounding forces, while in (6d) the same combination is inserted under an 
interrogative intonation, hence the zeroed first argument refers to the speaker in (6c) while it 
refers to the interlocutor in (6d): 


(6c) da dep bua tsi Ø
PRF CMP eat rice 1S
‘I have already eaten.’


(6d) da dep bua tsi Ø?
PRF CMP eat rice 2S
‘You have already eaten?’
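
The reconstruction of the zeroed first argument in (6c) and (6d) can be sketched as a rule keyed on the assertive force of the utterance. The fragment below is a minimal illustration under stated assumptions: the function name, the two force labels and the return values "speaker" and "interlocutor" are hypothetical, and only the two forces discussed here are modelled.

```python
from typing import Optional

# Hypothetical labels for the two assertive forces discussed for (6c) and (6d).
DECLARATIVE = "declarative"
INTERROGATIVE = "interrogative"


def reconstruct_first_argument(force: str, overt: Optional[str] = None) -> str:
    """Return the referent of the first argument of an utterance.

    An overt argument is returned unchanged. A zeroed argument is
    reconstructed from the assertive force, following the text:
    the speaker under declarative grounding, the interlocutor under
    interrogative intonation. Other forces are not modelled here.
    """
    if overt is not None:
        return overt
    if force == DECLARATIVE:
        return "speaker"
    if force == INTERROGATIVE:
        return "interlocutor"
    raise ValueError(f"no reconstruction rule sketched for force {force!r}")


print(reconstruct_first_argument(DECLARATIVE))    # (6c): speaker
print(reconstruct_first_argument(INTERROGATIVE))  # (6d): interlocutor
```

The rule takes the assertive force as input and nothing else from the context, which is what distinguishes this grammatical ellipsis from a pragmatic implicature.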


Stability of lexical valency, unless a grammatical operation such as a transitiving /
intransitiving auxiliation explicitly modifies it, is necessary in all languages to account for
different kinds of grammatical ellipsis. These ellipses should not be confused with pragmatic
implicatures. For example, in English, non-finite complement clauses induce the zeroing of the
first argument of the embedded clause according to lexical properties of the higher verb:
(6e) contrasts with (6f):

(6e) John_i had promised Luc_j to Ø_i come.
(6f) John_i had requested Luc_j to Ø_j come.


In War, pronouns referring to speech act participants are omitted unless grammatically 
salient or focalized, see examples (15) and (17) as opposed to (6c) and (6d). 


3. di unexpected agent marker for first and second arguments 


Counter-expectation with salience in di is a grammatical relation involving the speaker and
the argument(s)-verb relationship. Being selected as an element of counter-expectation in di
highlights this element as an agent of this surprise, be it a first or a second argument. English
has no way to convey exactly this grammatical meaning. Though focalisation is inappropriate
because it is a discursive operation, I have no other choice in many examples than to
approximate the translation with focalisations. (11) contrasts with (12), but in both cases the
agent is the argument of the intransitive verb la? ‘come’. Salience does not modify valency, i.e.
la? ‘come’ remains an intransitive verb in (12), as it is in (11). The same is true for transitive
verbs where the second argument is salient. In other words, when salient, arguments should
not be confused with adjuncts introduced by the same markers used as prepositions, as shown
in §6.


(11) de la? ?u sla:
PRF come 3MS rain
‘The rain has come.’


(12) da la? di ?u phrua.
PRF come SALA 3MS hail
‘[Unexpectedly,] [it is] hail [which] has come.’


A salient argument may be further focused as in (13), where it is permuted before the 
assertive marker a ‘DCL’. In (13) the hero answers some fairies who accused him of being 
rude: 


(13) di ihi ki hantha tzat a lea? khlem dor
SAL 2sr PART female kind DCL do without manners
‘You [unexpectedly], beings from the female kind, behaved without manners.’


In (14) and (15), English expressions like ‘all by himself’ and ‘in person’ convey a
meaning close to the kind of salient information involved, more precisely than a focalization:


(14) a Yani ?u=na ?u sni di erau
DCL make M=PROX M house SALA 3MSsr
‘He has made this house all by himself.’


(15) tzu lea di nje
CONS go SALA 1ST
‘I will go in person.’


Salience in di may mark a second argument, as in (16b), where wo? Monti is the agent of a
surprising encounter for the speaker. The second argument wo? Monti is interpreted as the
agent of the grammaticalized unexpected information. However, this foregrounding of the
second argument wo? Monti does not background the first one, and the speaker remains the
thematic agent who experienced the meeting in (16b) as in (16a). English is unable to render
this kind of double-layered agentivity. di may be used in the same utterance with the
meanings of ‘unexpected’ and ‘in person’, as in (16c):


(16a) do ja-to? ?u wa? Monti.
PRF TRN-meet 3MS wah Monti
‘[I] met Woh Monti.’


(16b) do ja-to? di ?u wa? Monti.
PRF TRN-meet SALA 3MS wah Monti
‘[It is unexpectedly] Woh Monti [whom I] met.’


(16c) ?a=tzu la? ?u=tə di e?au
NEG go this=one SALA him
a pha? ?u di ?u wa? Karor
DCL send 3M SALA 3MS wah Kerong
‘This one did not come in person, he sent (unexpectedly) wah Kerong.’


As a second argument, a salient agent may be inanimate. In the context of (17), Krishno
asks a damsel for food in exchange for money but, as an answer, in (17) she offers betel nuts
as a sign of free hospitality. kvua ‘betel nut’ under the salience marker di becomes some kind of
agent of an unexpected implicit proposal of a love affair:


(17) eram, bua phrang di i ua,
2MSsr eat first SALA P betelnuts
‘You, [as opposed to what you expect] you will eat first betel nuts.’
(‘As opposed to being a paying guest as you expect, you will be my host.’)


4. ha? salient marking of a second argument involving benefit and/ or empathy 


According to the lexical predicate and to the animate or inanimate character of the second 
argument, ha? marks the second argument as an agent of a primary or secondary benefit 
either for himself or for the first argument and/or empathy for the first or second argument 
according to constructions. 

If animate, the marked argument can express:

- a chosen beneficiary, as in (22);

- the unaware beneficiary of a punishment, as in (23); the unexpected recipient of an
inadvertent inappropriate doing from the first argument, as in (24); the recipient of the
unwilling frightening made by the speaker, as in (25). In (26) the recipient of the unwilling
detriment is the speaker, who requests empathy from the interlocutor hə ‘YOU FEM’ in order to
get back his belonging. In (25) ha? marks the child as an unwilling victim and requests
empathy from the interlocutor. The adjunct laya ‘bear’ is introduced by di as an instrumental
marker, see §6.

In all constructions where the animate marked argument is not a direct beneficiary, the 
marking involves empathy with the interlocutor or with the marked second argument. 


(22) a temphua di eraka ha? erau.
DCL greet SALA 3FSsr SALB 3MSsr
‘She unexpectedly [was the one who] greeted him [as her chosen one].’


(23) a dat mo ha? ?u hun.
DCL beat 1S SALB MS child
‘I have beaten [my] child [for his own sake].’


(24) a pon-sá? dje'u ha? eraha.
DCL make-feel sad SALB 2FSsr
‘I have hurt you [unwillingly, forgive me].’


(25) di ko Jana a pa-ktiay ya ha? ?u hambo.
INST FS bear DCL CAUS-frighten 1S SALB M child
‘[It is unexpectedly with] the teddy bear [that unwillingly] I frightened the child.’


(26) a lum hə ha? ?i ton me.
DCL take 2FS SALB P OWN 1S
‘You have taken my own [unwillingly, so give it back].’


If inanimate, ha? can mark a second argument as the agent of a benefit for the first
argument. In (27) it is a favourite thing for the implicit ‘I’ first argument. In (28) it is a
malevolent agent defeated by the explicit focused first argument:


(27) ha? ^ fmeja Tra ^ "i nu Khongla? a maa tam na
SALB P stones P=DIST P PROX KHONGLAH DCL love much 1S
‘The stones, those near Khonglah, I love [them] so much.’


(28) 23.pnagyo a pa-yri ya ha? U kffo:u
MYSELF DCL make-free 1S SALB MASS illness
‘By myself, I have freed illness [for my own satisfaction].’


5. Salience involving an implicit lexical predicate 


Yen nje ‘wait for me’ and ma? nje ‘look at me’, where nje ‘me’, the second argument of the
two lexical predicates, is expressed directly, contrast with (29) and (30). The marking of
arguments in ha? or in di with two sets of verbs involves specific lexical zeroings and is a kind
of lexicalization, somewhat as English allows metonymic ellipsis (of contextual content
words) in (31) and (32), as opposed to (33) and (34), which do not contain metonymic
zeroings:


(29) en ha? nje.
wait SALB 1Ssr
‘Wait [and watch my belongings] for me!’


(30) ma? di nje.
look SALA 1Ssr
‘Look at me, [beware and imitate me].’


(31) I have drunk the bottle < I have drunk the beverage of the bottle.


(32) The whole street had whistled when Holland arrived. < People in the whole street had
whistled when Holland arrived.


(33) I have broken the bottle.


(34) The street is under repairs.


(31) contrasts with (33), while (32) contrasts with (34). These ellipses occurring with salience
markers and appropriate lexical verbs occur in imperative utterances, which induce a request,
and are directly related to their usual uses: with ha?, a request for empathy enlarged to a request
for help (for the benefit of the speaker); with di, the interlocutor is requested to act in a way he
cannot guess by himself, imitating the speaker.


6. AA deictic origin, polyfunctionality and cline of grammaticalisations of ha? and di in 
War 


6.1. AA deictics ha? and di 


Salience argument markers are of deictic origin, like many of the particles of the ADS. In their
salient function they keep something of this origin, as surprise in di or empathy in ha? is
directed toward the animate first argument or toward the interlocutors. Pinnow (1965)
reconstructs AA personal pronoun systems and shows that they are of deictic origin.
Pinnow (1965: 33-35) describes hih “here” in Nicobarese and reconstructs *hVn as *hV >u
space and directional deictics in Munda: Santali han “this way, far away”, Mundari and
Kharia han “yonder, there”, Santali hon “that, at some distance”. Shorto (2006: 65 and 14352)
analyses further ha? and hey? deictics in Bahnaric and in Aslian: Semang has ha? teh “there”,
Chrau he:? “here!, this (pointing)”, Bahnar hey “just now, that just mentioned”, Mah Meri ho?
“here”, Mintil A3? “here, this”.

Pinnow (1965: 36) analyses *di-/ de- in Bahnaric: Chrau di-ao “this” and di “3 SING” 
honorific pronoun. 

Within PWL, War di is a directional pointing deictic: di=na “in this direction, near”, di=ta
“in this direction, further”.

In Pnar, ha? is used as a space and time deictic: ha? i tae “after that place, then”, ha? pray
“further on”, ha? pardi “in the center”, and also as a beneficiary adjunct marker: ha? u=ni “to
this one” (Daladier, unpublished data in Mawkyndeng Pnar).


6.2. Adjunct markers: di instrument, ha? beneficiary and ‘about’ 


Adjuncts differ from arguments in that 1) they are not lexically constrained by verbs and 2) the
particles introducing them are obligatory.

di has an instrumental value as an adjunct marker, as in (35), where it introduces
motor ‘car’ as an adjunct of the intransitive verb lea ‘go’:


(35) tzu lea di motor. 
CONS go INST car 
‘I will go by car.’ 


ha? is a benefactive adjunct marker in (36) and an ‘about’ adjunct marker in (37):


(36) pe-ffri por ha? nje!
make-free time BEN 1S
‘Make free time for me.’ (‘Give me some time.’)


(37) tzu perom ha? ?i ksem
CONS story ABOUT 3P birds
‘I am going to tell a story about birds.’




6.3. Grounding marker: ha? kind of inferential force, di teasing mood 


In the initial grounding position of a clause, usually combined with another grounding marker,
ha? states that the main lexical predicate, action or event, takes place under a higher source
of action or source of knowledge, like fate or first-hand sensorial knowledge. ha?
grammaticalizes some kind of inferential value, where the action, either bad as in (38) or
good as in (39), takes place without the will of the main actor. It can also be used in a
dependent clause where the value of necessity depends on the higher verb and its first
argument, as in (40), (41), (42). In (41) ta? is the lexical verb ‘know’, as it is followed by its
first argument.


(38) ha? a kia ti-lo ti tg ga
Awre DCL sit 3MS in=here in tape 1S
‘[Fate makes it that] he was sitting here, on my tape.’


(39) ha? a mon hea? kə ha? e?au.
Amr; DCL love much 3FS SALB 3MSsr
‘[Fate makes it that] she deeply loves him as her chosen one.’


(40) ?a=tzu phu? kwa? ?u=ta ha? tzu di lok.
NEGPAST YET wish M=DIST Amar CONS get spouse
‘That one had not wished yet [to have the fate of] getting married.’


(41) a ta? ya ha? a a? ?u tipo? sni u.
DCL know 1S Amr; DCL have 3MS inside house 3MS
‘I know [for sure that] he is in his house.’


(42) a ra: posa me ha? tzu kti tzea ka.
DCL give money 1S Awre CONS buy vegetable 3FS
‘I gave money [to ensure that] she buys vegetables.’


As a grounding particle, ha? is often used to express empathy with the interlocutor, or 
to deny responsibility, for the benefit of the speaker or an argument, as in (40) or to ensure 
that something has taken place or will take place. In that way ha? ‘inferential’ may be 
considered as some kind of grammatical extension of its differential argument marker uses. 

ha? and di contrast in (43) and (44), which have opposite meanings. As an assertive
marker, di expresses a kind of teasing force (as what is said is too unexpected to be
believed), while ha? ascertains the utterance:


(43) tzu labudia me ha? ?i a on.
CONS believe 1S Amr; WHAT DCL say
‘Sure I will believe what you say.’


(44) tzu labudia ya di ?i a on!
CONS believe 1S Aras WHAT DCL say
‘Sure I will believe what you say! (how could it be?)’


6.4. Cline of grammaticalisation of ha? and di in War


There is some kind of continuity between the deictic uses of di and ha? and their different
grammaticalisations. di and ha? marking arguments highlight some kind of subjective
direction of the process from the viewpoint of the speaker toward his interlocutor or between
two arguments: kinds of connivance and request for empathy. As adjunct particles and as
argument markers, di and ha? have agentive and beneficiary related meanings, except for ha?
‘about’. ha? and di have grammaticalized further as assertive particles. ha? may involve some
kind of implicit higher agent who drives action or ascertains knowledge, for the best or for lack
of guilt. di grounds an utterance in a teasing way and extends the value of surprising agent of
the differential marking to an unbelievable utterance of the interlocutor. Grounding particles
indicate the direction of the speaker's attitude toward the interlocutor.

In South Munda, di is still used as a differential argument marker in addition to clitics
referring to arguments in the verbal base. In Gorum, -di highlights agentive roles (see §8). In
Bonda also, -di highlights agentivity (instrumentality or source of the process), see
Bhattacharya (1968: 1212). In Kharia, -di has switched and is suffixed to any argument to
highlight beneficiary roles, see (55), (56), (57).

In Semelai, la, used in the differential marking of arguments for agentive roles, is also used
as an adjunct marker for the causal source of a process, see examples (51), (53), (54).

The cline of grammaticalisations of ha? and di in War might be schematized as follows,
with a parallel grammaticalisation of their uses as differential argument markers and as
adjunct markers. Productive grammaticalisation into grounding elements is a wider feature of
War.


                              optional argument marking
AA directional deictics   >                                >   grounding particles
                              adjunct marking


Figure 1 — Cline of grammaticalisation from deictics to differential markings and to grounding particles 


7. Differential argument marking as a sub-system of the assertive dependency system 
of War 


7.1. The assertive dependency system of War and the interaction of its sub-systems 


The letters in Table 1 correspond to a grammatical hierarchy induced from my data on a large
corpus of oral literature. This hierarchy appears to be defined both by word order and by
combinatorial properties of markers in each category. Differential marking of arguments for
salient semantic roles, G, is the lowest sub-system of the assertive system.

A, B, C categories of this assertive system construct in initial position and an occurrence 
of at least one of them is obligatory in utterances except for simple imperative utterances and 
declarative utterances headed by deictic or by possessive relationships. Other categories of 
this assertive system are optional. 

Reference to the subjectivity of interlocutors is often involved in B, C and G categories.
The particles of this system interact in a structured way. C, D, E, F, G are lexically
constrained by the verb and in parallel grammatically by higher particles.


Table 1 - Overview of ADS in War

A | Assertive-correlative particles in narration. Express values like ‘after this’,
  | ‘this being done’, ‘at that point’. Frame sets of clauses in monologue or in
  | narratives as episodes from the causal-temporal-aspectual viewpoint of the
  | speaker
  | Initial position of a clause with assertive function

B | Illocutionary forces
  | Hortative, precative, injunctive and prohibitive markers. Polar and non-polar
  | question markers. Positive and negative mirativity
  | Clause linking uses
  | Initial position in the utterance

C | Positive and negative aspect and subjectivity markers (e.g. intentional,
  | empathy, deliberate, agreement, negation-aspect, evidential, mirative,
  | continuative, consecutive, perfect, prospective)
  | May construct with simple nouns
  | Combine within B and within C
  | Clause linking functions in initial position of dependent clauses with values
  | relative to assertive markers in the main clause and lexical features of the
  | main verb
  | Initial position in a clause (except lo ‘negation-future’)

D | Modal particles, non-declarative. Different kinds of ability and necessity with
  | subjectivity values; agreement and emphasizing particles
  | Combine within D; extend C, E and G values
  | Pre-verbal or final position

E | Auxiliaries (grammaticalized serial verbs)
  | Combine within E and under common A and B markers
  | Extend some of C, D and G values
  | May change or add valency
  | Pre-verbal

F | Aktionsart markers (causatives, one kind of reciprocal); one non-assertive
  | negation
  | Aktionsart may be transitiving; combine within F
  | Verbal prefixes

G | Optional marking of arguments. Highlight specific semantic roles with
  | reference to subjectivity: unexpected agent in di and benefactive or
  | inadvertent or specific in ha?. Pre-posed to arguments
  | Extend some of C and E values
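
The obligatoriness condition stated in §7.1 can be sketched as a small check over the categories of Table 1. The fragment below is an illustration only: the function name, the short category descriptions and the three utterance-type labels are hypothetical, and the check covers just the statement that at least one of A, B, C must occur except in simple imperatives and in declaratives headed by deictic or possessive relationships.

```python
from typing import Iterable

# Categories of Table 1, from highest (A) to lowest (G); descriptions abridged.
ADS_LAYERS = {
    "A": "assertive-correlative particles in narration",
    "B": "illocutionary forces",
    "C": "aspect and subjectivity markers",
    "D": "modal particles",
    "E": "auxiliaries",
    "F": "Aktionsart markers",
    "G": "optional marking of arguments",
}

# Categories constructed in initial position, one of which is normally required.
INITIAL_LAYERS = frozenset("ABC")

# Hypothetical labels for the utterance types exempted in section 7.1.
EXEMPT_TYPES = frozenset(
    {"simple_imperative", "deictic_declarative", "possessive_declarative"}
)


def has_required_grounding(layers: Iterable[str], utterance_type: str = "other") -> bool:
    """Check that an utterance carries at least one A, B or C marker,
    unless its type is one of the exempted ones."""
    present = set(layers)
    unknown = present - set(ADS_LAYERS)
    if unknown:
        raise ValueError(f"unknown ADS categories: {sorted(unknown)}")
    if utterance_type in EXEMPT_TYPES:
        return True
    return bool(present & INITIAL_LAYERS)


# Example (12): PRF belongs to C (perfect), the salience marker di to G.
print(has_required_grounding({"C", "G"}))
# A salience marker alone does not ground an ordinary utterance.
print(has_required_grounding({"G"}))
```

The sketch treats G as dependent on the higher layers, in line with its description as the lowest sub-system of the assertive system.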


7.2. Comparison of agentive values conveyed by verbal voice in Indo-European 
languages and agentive values conveyed by the assertive system of War 


War has no passive voice in the sense of a verbal marking with passive auxiliation which
intransitivizes a transitive verb and conveys both a stative interpretation of this verb and an
affectedness interpretation of its argument. In War, intransitivizing auxiliaries may foreground
an argument with many kinds of agentive values, whereas in a language like English the passive
voice only foregrounds "affected" arguments; see Daladier (2012).

Gradient active and passive values, together with other subjective values, are conveyed by
auxiliaries. The marking of arguments for agentive roles is constrained not only by lexical
verbs but also by other grammatical features of constructions, such as auxiliaries and causative
prefixes. An intransitivizing auxiliary as in (46) foregrounds as a first-argument agent a former
second argument of a transitive lexical verb as in (45), and this foregrounded subject may in
addition be marked as a salient agent in an appropriate construction as in (47):
(45) a burom i Nongbareh "u.
DCL praise 3P Nongbareh 3MS
‘He praises Nongbareh people.’


(46) a sa? fa? burom — Nongbareh. 
DCL AUXreL AUXDETRM honour 3P Nongbareh
‘Nongbareh people feel dis-honoured.’


(47) a sa? fa? burom di iï Nongbareh.
DCL AUXreL AUXDETRM honour SALA 3P Nongbareh
‘[It was unexpectedly] the Nongbareh people who felt dishonoured’ (because they lost
in place of those who were expected to lose).


Salience in di in transitive constructions may add some interesting nuances to some
deponent values of intransitive constructions. Deponent values are not marked in War by a
middle voice; they are simply expressed lexically by intransitive lexical verbs, as in (48). It is
the corresponding transitive value which is marked by a causative prefix, as in (49), while the
salient transitive construction of the unexpected agent in (50) keeps something of the
deponent value of (48), but from the viewpoint of the interlocutor (the light went out
unexpectedly for the interlocutor, not for the agent speaker):


(48) də ple? kə lajt.
PRF go out(INANIMATE) 3FS light
‘The light went out [by itself, without being switched off].’


(49) da tam-ple.:? ka lajt ya.
PRF CAUS-go out 3FS light 1S
‘I have switched off the light.’


(50) de tam-ple.:? ka lajt di nje.
PFV CAUS-go out 3FS light SALA 1Ssr
‘I [myself] have switched off the light [surprisingly for you].’


The salient agentive marking interacts with the auxiliation sub-system and produces, in
combination with it, a very rich set of gradient agentive values. Salience foregrounds
arguments without valency change, in its own grammatical way.

ha? and di as differential argument markers involve, with many verbs, a secondary
grammaticalized action added to the lexical information: with ha?, an implicit request for
empathy or even help, and with di, some kind of request to pay attention to something
unexpected. This grammatical information should not be confused with pragmatic implicature.

As an assertive particle, ha? specifies in a main clause that the action or event is driven
beyond the human will of the agent, by fate or by God. Under this epistemic use of ha?, the
understated higher agent (God or fate or main clause argument) has a benefactive or a
detrimental role, and the speaker expresses empathy toward the viewpoint of his interlocutor.
This renews the benefactive and inadvertent salient uses of ha?.

Differential marking in di is complementary to two morphologically different positive and
negative mirative assertive particles. Mirative in War expresses both a surprise force and
sudden consciousness of a state of affairs opposite to what was expected by the speaker.
Positive and negative grounding particles are analysed in Daladier (2010).
8. Vestiges of differential marking of arguments in two different AA verbal systems


Munda languages (see Pinnow (1966)), like many Tibeto-Burman (TB) languages (see
Thurgood (1985), LaPolla (1992)), have verb agreement systems marked in a verbal base
where clitic(s) referring to argument(s) highlight these arguments according to a semantic
hierarchy such as: speech-act participants > third persons; human/animate > non
human/inanimate; new/unexpected information > already known/expected information. Some
features of these hierarchies appear to be shared by salience in the ADS system of War and,
similarly, in the different ADS of Pnar and Lyngngam (unpublished data). In War, pronominal
arguments referring to speech act participants are omitted. These agreement systems inside
verbal systems and salience in ADS grammaticalize in different ways several common
features: the first argument is not the only one which can be foregrounded, and one or two
arguments may be foregrounded, but the semantic roles involved are mostly the same: agentive
and beneficiary.

LaPolla (1992:301) shows that a majority of TB languages have no verb agreement and
“no trace whatsoever of having had one”. Argument highlighting within a verbal base, with
clitics referring to arguments, is analyzed as a secondary renewal in TB languages by
Thurgood (1985) and in Tibetan by Tournadre (1997). Pinnow (1966) analyses the diversity
of Munda verbal bases and shows why verb agreement is a renewal, especially because it
is less developed in conservative South Munda languages.

Salience in War foregrounds a rich variety of agentive values as it can promote either a 
first or second argument and combine with various auxiliaries which already produce a 
gradience of active and passive values, see Daladier (2012). Salience promoting unexpected 
agents may also combine with mirative grounding particles. Beneficiary and unwilling 
detrimental roles of arguments can be highlighted as well as agentive values. Neukom (1999) 
shows that in Santali, a North Munda language, beneficiary elements and sub-clauses 
expressing a goal are cross referenced in the verbal base by clitics. Neukom also shows that 
the foregrounding marking by clitics referring to arguments in Santali verbal bases has 
common features with that of Hayu, a TB language analysed by Michailovsky (1988). 

Salience marking of beneficiary arguments in ha? in War and referential clitics in the 
verbal base marking salience in Munda and in Hayu show that beneficiary roles may be as 
salient as agentive roles in different verbal systems, in some AA and TB verbal systems. In a 
fascinating way, Wolff (1973:79) shows that Javanese, Visayan, Tsou, Atayal (Philippine 
languages) have unconventional verbal systems where an agentive marking cannot be 
confused with an instrumental case and where agentive and beneficiary roles can be 
foregrounded. 

Salient marking of an argument of specific lexical classes of verbs like ‘to love’ produces
a strong chosen beneficiary value in War and in Semelai (Aslian), see Kruspe (2004), and also
in some Philippine languages, see Ross (2002) and Blust (2002). PWL and Aslian are located
at the extreme North West and South East of the MK area, respectively.

Optional agentive marking on arguments, in addition to clitics in the verbal base, has
been noticed in various TB, Austronesian and MK (Aslian) languages, especially by Matisoff
(2003) and MacGregor (2006), and considered as some peculiar kind of split ergative
marking. These phenomena often involve subjectivity markings complementary with
volitional, evidential, active and passive values which ground constructions as utterances.
The system described by Matisoff (2003:46) for Temiar after Benjamin's data, with many
interesting semantic findings, seems in fact to show vestiges of an optional salience of
arguments. This salience would remain within a verbal system in addition to subject-verb
agreement, rather than being a split ergative system, because it also applies to intransitive
verbs. Temiar differential marking for agentive roles looks similar in several respects to
the system described by Kruspe (2004) for Semelai, with the same particle la, under the
terminology of optional oblique marking, after she gave up the term split ergative.

Traces of highlighting differential markings on arguments, in addition to the conventional 
highlighting system marked by clitics referencing argument(s) in the verbal base, are found in 
conservative South Munda languages and in Aslian in their different verbal systems, for 
agentive or for beneficiary roles. 

Kruspe (2004) shows that in Semelai, transitive and intransitive verbs may have their first
argument marked either as a direct or as an oblique agent argument. An oblique (or here
salient, for reasons given below) agent in Semelai is always animate. With intransitive and
transitive verbs, a salient agent may be an external or involuntary source of the process. The
encoding of the agent role may vary according to the aspectual value of the construction: it is
marked in (51) with la ‘A’, where it is placed after the verb, but unmarked in (52), which is SVO:


(51) dan la=?ma?=hn. 
3A=hear A-mother-POSS 
‘Her mother heard (her).’ 


(52) sma? pok rbana.
person beat drum
‘People were beating drums.’


The marker la is used again for simple or clausal adjuncts having an agentive role as a
source of the process or as a source of information, in constructions which have a perfect or
resultative aspect and which differ from causative constructions. The particle la used as
agentive marker which highlights some kind of involuntary source in (51) also links adjuncts
with a causative source of the process value in (53) and (54). This double feature can be
compared to the use in War of markers as salience argument markers and also as adjunct
markers with related values:


(53) ki=?jam la=je. 
3A=cry BCS=1S 
‘He cried because of me.’ 


(54) ki=c?en la=bapa? drot.
3A=be content BCS=father return
‘She is happy because her father returned.’


The agentive salient marker of core arguments la also links simple or clausal adjuncts
with a causative value, which is an extension of its source of the process value.

The verbal system of Semelai as described by Kruspe still shows a rather non
conventional verbal system. In particular, utterances are not grounded on tense; there is no
finite/non finite opposition; and subject-verb agreement, like salience, is optional and depends
on aspect.

Very interestingly, the Munda deictic *di/da is also renewed as a salient differential marker
of first and second arguments in several South Munda languages. This is described
independently by different linguists under various analyses: for Kharia, see Malhotra (1982:
174-178), and for Gorum, see Aze (1973), quoted by Anderson (2007: 45). In Gorum, di is used
as what I call a salience marker for agentive roles, suffixed to arguments, rather than as a
discursive focus marker as described by Anderson (2007). From the examples given, one
might guess that, in addition to highlighting an agentive role, di also expresses that this
agent was unexpected for what it did. It seems from the data given by Malhotra that in Kharia
-di is likewise, more specifically than a focal particle, a salience marker in the grammatical
sense analysed here, but with a switch to beneficiary values of core arguments, marking a
first argument in (57) and a second argument in (55) and (56):


Kharia, Malhotra (1982: 174-178) 


(55) gotuy  Bolram-di ^ etur-da?m-u. 
cloth Bolram-FOC  S-cover-o 
‘A cloth covers Bolram.’ 


(56) no?» Bolram-di |^ etur-da?m-y gotur batur. 
3 Bolram-FOC  s-cover-o cloth with 
‘He covers Bolram with a cloth.’ 


(57)  Bolram-di da?m-u duku. 
Bolram-FOC  cover-O AUX 
‘Bolram is covered.’


9. Conclusion 


War, like Pnar and Lyngngam but unlike Khasi, has developed a conservative isolating
morphology with an assertive dependency system (ADS), and shows no cline toward a
conventional verbal system, in particular no kind of argument-verb agreement. Instead, War
has a differential marking highlighting core arguments for agentive and beneficiary or
detrimental roles, with values of agentive unexpectedness for di and beneficial empathy for
ha?, referring to the subjectivity of interlocutors or animate arguments. This differential
marking is analyzed here as the lowest sub-system of an ADS with seven levels in War.

Questions on relevant grammatical categories and their evolution in Sino-Tibetan have 
been raised, especially by Thurgood (1985), Li and Thompson (1976), LaPolla (1993), 
LaPolla and Poa (2006) concerning interactions between morpho-syntax (in a conventional 
sense) and pragmatics; they can be related to questions on dating verbal agreement and the 
very nature of ergativity in TB raised by Thurgood (1985), LaPolla (1992) or in Austronesian 
the question of voice discussed in Wouk and Ross (2002) and the question of so-called 
instrumental and benefactive cases raised by Wolff (1973) for some Philippine languages. 

I propose an inductive method for defining the relevant categories of a structured
grammar for War. The grammar of War is not pre-verbal: it has evolved by renewing its own
isolating particles, innovating within its own grammatical semantics, which includes different
kinds of reference to the subjectivity of the interlocutors in the different layers of its ADS.
Most of the particles of the ADS in War have an AA origin as deictics; see Pinnow (1965).

Salience markers have complex construction properties in the technical sense that they
interact in parallel with higher particles of the ADS and with lexical predicates on the argument
they mark. Rather than simple tree dependencies in a syntagmatic syntax, they involve
multi-term dependencies between lexical predicates and other grammaticalized elements in
lattice dependencies. The interpretation of salience markers in a construction depends on higher
markers of the ADS, on the lexical verb, on animate/inanimate features of the arguments and
on the first or second position of the marked argument. As shown here, different agentive
values conveyed in serial constructions, such as kinds of passive and deponent values, interact
with agentive values marked by salience in di. Unexpected information expressed by the
agentive salient marker in di may combine with a mirative assertive marker.

From the viewpoint of the evolution of AA languages, the differential marking of
arguments within an isolating morphology seems to be a conservative feature.
In Aslian (Temiar and Semelai) and in South Munda (Kharia, Gorum, Bonda), vestiges of
a differential marking of arguments for either agentive or beneficiary roles, or for both, are
used in addition to their own argument-highlighting systems inside rather different verbal
systems.

In War, AA spatial and directional deictics have grammaticalized into two salience
markers and then have climbed the hierarchy of the ADS system to renew again as grounding
particles: a teasing force and a kind of inferential assertive marker.

The differential marking of arguments in War is an important feature of its ADS. This
feature has vestiges in AA verbal systems. In AA, optional verbal agreement with clitics
referring to arguments seems to be a secondary feature, after ADS and before plain obligatory
subject-verb agreement.


Abbreviations 


1,2,3     Person pronouns or gender/plural/mass term classifiers
1,2,3 sr  Emphatic pronouns or pronouns encoding non subject arguments and adjuncts
AB        Ability modality
Amt       Assertive particle with inferential value
Acor      Assertive temporal-causal correlatives
Aras      Assertive teasing exclamatory force
AUX       Auxiliary: grammaticalized serial element
BEN       Beneficiary for adjuncts
CAUS      Causative prefixes
CL        Classifier
CMP       Completely achieved
CONS      a) Imminent (consecutive) event or action or intent in a main clause.
          b) Consecutive in correlation to a higher assertive marker in a dependent
          complement clause. c) Purposive (consecutive goal) marker in an adjunct clause
CONT      Continuative or lasting state as an aspectual marker; past temporal deictic
DCL       Grounding particle with mild involvement of the speaker
DIR       Directional deictic. Further sub-classified DIRN for north and upward, DIRS for
          south and downward
DIST      Distal deictic
PROX      Proximal deictic
EMP       Emphasizes a request or an affirmative answer
EV        Eventual modality
F         Feminine
FIN       Process to be performed up to the end
HAP       Happenstance
INST      Adjunct instrumental
INT       Interrogative pronoun
M         Masculine
MA        Mass term
NEUT      Neuter
NEC       Necessity
Neg       Plain negation without grounding function
Negp      Grounding negation with accomplished aspect




Negr      Grounding negation with potential, consecutive or expected aspect
P         Plural
PART      Kind of partitive article which refers to a community subgroup
PRF       Perfect as an assertive marker; ‘already’ or ‘ago’ values as a nominal deictic
REM       Distal deictic
REP       Repetition of a process
S         Singular
SALA      Agentive unexpected salience marking
SALs      Benefactive salience marking
TRN       Aktionsart prefix. To perform an action in turn or with reciprocity, depending on
          lexical features of the verb
Abstract

Morphological processes such as affixation and compounding result in some eye-catching
morphophonemic variations in the Kamrupi dialect of Assamese. This paper looks at the nature of
those variations and of the processes that arise from affixation.

We focus on morphophonemic variations in finite verbs, since the verb has the greatest
potential for such variations because of affixation. Presenting the paradigms of some verbs, we
look at how the structural components of a verb form affect each other phonologically, resulting
in forms which give very little indication of the root forms of each. The same set of data is
presented paradigmatically (vertically) on the one hand and syntagmatically (horizontally, in
linear order) on the other. In the paradigmatic organisation, the paradigms of a verb are
arranged; in the syntagmatic organisation, selected forms of the paradigms are arranged according
to the linear sequence in which the root and its affixes (suffixes in this paper) occur. This
gives us an idea of the processes and conditions, such as metathesis, assimilation, gemination
and deletion, that are accountable for forms which give almost no idea of the root forms of
either the verb or the affixes. The paper looks at the verbal forms on the one hand and the
morphophonemic processes on the other, with special focus on the nature of the variations and
the environments where they take place.
1. Introduction


Assamese or Asamiya, the official language of Assam, is a member of the Indo-Aryan family
of languages with two major varieties: the upper Assam variety and the lower Assam variety.
The standard or official variety of Assamese developed from the upper Assam variety. Dr
Bani Kanta Kakati divided Assamese into two broad dialectal groups, the Eastern and the
Western variety. The Eastern variety, according to Dr Kakati, ranging from Sadiya down to
Guwahati, “hardly presents any notable points of difference from the spoken dialect of
Sibsagar, the capital of the late Ahom Kings” (Kakati, 1941:49). On the other hand, as he
observed, the two major Western districts of Kamrup and Goalpara have several local dialects
which display notable differences from each other as well as from the standard counterpart of
Eastern Assam. However, Goswami and Tamuli (2003) mentioned a third one, i.e. the Central
or Intermediate dialect:


According to this new regrouping, Eastern Asamiya is spoken in the districts of Sivsagar and Lakhimpur, 
shading off the contiguous areas of Arunachal in the east and down to the districts of Sonitpur and Nowgong 
in the west. Western Asamiya covers a fairly big area from a little east of Guwahati in the south and the 
Darrang district in the north, and down to the district of Goalpara in the west. The central or intermediate 
dialect occupies the area in between the two regions mentioned above, i.e. the entire Morigaon district 
extending to a little east of Guwahati. (Goswami and Tamuli 2003: 400) 


1 It is a great pleasure to express our heartfelt gratitude to Prof. Jyotiprakash Tamuli, the Head of the Department
of Linguistics, Gauhati University, Guwahati, Assam, for suggesting the topic to us and offering constant help and
support of varying kinds. We thank all the teachers and research scholars of the Department of Linguistics,
Gauhati University for their cooperation and inspiration. We would like to thank the reviewer of the drafts of the
paper for the valuable comments and suggestions. Special thanks to the Editorial board, especially to Stephen
Morey for his patient cooperation with regard to editing and fine tuning of the work.
The present Assam was known as Kamrup as well as Pragjyotishpur, as is evident from many
references in ancient Indian literature. However, “Kamrup” became the predominant
name in the later part of its history. Although modern Assam is no longer known as Kamrup,
in the 20th century a district named Kamrup was formed, in which Guwahati, the capital city of
Assam, is situated. The Kamrupi dialect is spoken not in the present Kamrup district alone, but
in the nearby districts too, such as Nalbari, Barpeta, some parts of Kokrajhar, Darrang,
Bongaigaon etc. The name Kamrupi is derived from the ancient Kamrup Kingdom that
existed from the fourth to the twelfth century.

Goswami (1970) observed the presence of some words and phrases of the present-day
Western Assam variety, especially the variety spoken in the undivided Kamrup district, in the
classical literary works of the first Assamese poet Hem Sarasvati in the thirteenth century,
Madhab Kandali in the fourteenth century, etc. Hem Sarasvati's “Prahlad Carita” and Madhab
Kandali's translation of the epic “Ramayana” are noteworthy in this regard. Bharali (2004)
opined that it is a historical fact that, during the period of its formation and development,
Assamese spread from the west to the east. Kamrupi was in widespread use as the
language of literature in the medieval age. Various political and cultural issues, such as lack of
stability in government and frequent foreign invasions, caused a gradual demarcation in
its use, and in the long run it took the form of a local dialect.


Before the seventeenth century, Kamrupi was the literary language of Assam. Today, the notion of Kamrupi 
language includes the spoken dialects of Kamrup, Nalbari and Barpeta districts with some inter-variations 
among them. (Sarma 2009: 96) 


Major political and socio-cultural issues fuelled the emergence and
establishment of the present-day standard Assamese variety. It shares some similarities as well
as differences with the Kamrupi variety. A good number of core linguistic studies have been
carried out on the standard variety of Assamese, but works of the same kind on the
Kamrupi variety are relatively few. Our attempt is not to present a comparative
study of the morphological and phonological features of the two major dialects, but to
introduce the wonderful interplay of morphophonemics which is unique to Nalbariya
Kamrupi (the variety spoken in the Nalbari district). This paper attempts to
uncover the diverse positional and articulatory alterations that a single sound may undergo
at various levels of morphological processes. The phonological environments created by the
morphological processes in turn cause some significant phonological changes to the sounds
involved in a word under certain conditions. Here we try to move our focus from
‘what’ (form) to ‘how’ (process) by presenting the same sample of data in two different
orders, the paradigmatic and the syntagmatic order respectively. To avoid extreme lengthiness as
well as a haphazard presentation of data, we have looked at the morphophonemic variations
caused by affixation, excluding the other equally influential morphological process, ‘compounding’,
from our analyses. The term ‘word’ in this paper refers to the phonological entity as opposed
to a graphological one, since Kamrupi does not have an officially recognised written form yet.

We have chosen to focus on verbs rather than the other word classes, for the verb is the
category in Assamese with the greatest potential to show most of the variations that result from
affixation. Moreover, we are concerned only with finite affixation, as opposed to
non-finite affixation.


1.1. The preliminary observation 
The ground-breaking observation leading to the present topic was that in Nalbariya Kamrupi,
an alveolar flap /r/ gets assimilated to an immediately following alveolar or lateral sound.
Some other types of morphophonemic processes may take place too, under certain conditions,
if there is a vowel between them. Such processes include metathesis and assimilation, as well
as omission and formation of a new sound. This paper aims at discussing the nature
and kinds of the morphophonemic variations and processes mentioned above, along with their
exceptions. Consider example (1):


(1) bor ‘big’ + deuta ‘father’ → boddeuta ‘father’s elder brother’
    *bordeuta


In example (1), bor ‘big’ and deuta ‘father’ are compounded into boddeuta ‘father’s elder
brother’, and in this process the final r of bor gets assimilated to the immediately following
alveolar stop d of deuta, resulting in the gemination of d.
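
The assimilation in example (1) can be sketched as a small string-rewriting rule. This is only an illustration under stated assumptions: the function name and the set of triggering consonants are hypothetical (the text says only "alveolar or lateral", so the set below is an incomplete stand-in), and the forms are written in the plain romanisation used in this paper.

```python
# Assumption: an illustrative, incomplete set of alveolar and lateral consonants
# that trigger the assimilation of a preceding r. It is not taken from the data.
ASSIMILATION_TRIGGERS = {"t", "d", "n", "s", "l"}


def join_with_r_assimilation(first: str, second: str) -> str:
    """Join two elements; a final r of the first element assimilates to an
    immediately following alveolar or lateral, which yields a geminate."""
    if first.endswith("r") and second[:1] in ASSIMILATION_TRIGGERS:
        # Replace r by a copy of the following consonant (gemination).
        return first[:-1] + second[0] + second
    # No trigger: plain concatenation.
    return first + second


print(join_with_r_assimilation("bor", "deuta"))
```

With the inputs of example (1), the rule produces the geminated form boddeuta rather than the unattested *bordeuta.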

Based on this initial observation, we looked at some other instances with similar kinds of
phonological environments to see how the sounds behave in those. We observed that the
nature of the phonemic alternations in certain phonological environments is not random, as the
sound in question is conditioned by the sounds it occurs with.


1.2. The two approaches: paradigmatic and syntagmatic 


As mentioned above, verbal affixation in the Kamrupi dialect results in a number of sound
variations, including metathesis, assimilation and gemination of sounds as well as omission
and formation of new sounds, depending on the phonological environment in which a given
sound occurs. These variations further result in forms which are quite difficult to break
into separate morphemes: unlike those of the standard variety, the morphemes are not
simply juxtaposed but somehow buried in each other, so the morpheme boundaries
cannot be clearly defined. Hence, to present the
verbal morphophonemic variations, we arranged the data in two orders: paradigmatic or
vertical, where the paradigmatic forms of a verb are presented, and syntagmatic or horizontal,
where the linear sequence of the verbal components is presented. While the paradigmatic
approach gives an idea of the allomorphic variations of the roots, the syntagmatic approach
gives an idea of the morphophonemic processes and conditions that account for the
phonological variability of the paradigmatic forms. In other words, we broke down the
paradigmatic forms into a syntagmatic or linear order in several steps to see
how the affixes behave in certain phonological conditions. To be more specific, we tried to
move our focus from ‘what’ to ‘how’ by presenting the same sample of data in two different
orders, paradigmatic and syntagmatic (linear) respectively. In doing so, we have
the paradigmatic forms on the one hand and the step-by-step syntagmatic texture on the other.

For example, the paradigmatic approach shows the allomorphic variations of a verb as a
result of affixation, and the syntagmatic approach presents the linear order of various
affixations to the verb and simultaneously demonstrates the processes and conditions
accountable for the variability of the paradigmatic forms.

Example (2a) presents the paradigmatic form doilli ‘you held’, the 2nd person past form of
the verb dor ‘hold’, in the syntagmatic or linear order.


(2a) c1v1c2-v2c3-v3
     dor-il-i
     hold-PST-2.NH

     c1v1v2c2-c3-v3
     doir-l-i
     hold-PST-2.NH

     c1v1v2c2-c3-v3
     doil-l-i
     hold-PST-2.NH

     c1v1v2c2c3v3
     doilli
     ‘you held’


Example (2a) presents the formation of doilli ‘you held’. An explanation is needed in this
regard. The standard counterpart of this form is dorili, where the boundaries between the
verb dor, the past tense marker -il and the person marker -i are clear-cut. On the other hand,
the Kamrupi doilli gives no indication that its root form is dor; the final r sound is nowhere
present in this form. This example shows how the verb dor is phonologically affected by the
tense and the person suffixes to result in doilli, a form which gives us no clue for identifying the
root forms of either the verb or the suffixes.

In the very first line, the verb root and the suffixes are shown in their sequential order. For
ease of understanding, the vowel and the consonant sounds are numbered. The second line
shows the metathesis of i and r, which leads to the final form in the fourth line. This final form
is almost impossible to break into separate morphemes, i.e. the verb root, tense suffix and
person marker. The morphophonemic variations observed in example (2a) are:

• Metathesis: the i of -il and the r of dor.

• Assimilation: r has assimilated to the immediately following l to form ll gemination.

• ɔ gets changed into o when it immediately precedes i.
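
The line-by-line derivation in example (2a) can be sketched as an ordered application of two rewrite rules to a (root, tense, person) triple. This is a minimal sketch under stated assumptions: the function names are hypothetical, the rules are written only for the r/i/l environment of this example, and the vowel change listed last is left out because the romanisation used here writes both vowels alike.

```python
def metathesize(root: str, tense: str, person: str):
    """Metathesis: the root-final r and the initial i of the tense suffix swap,
    so the i ends up inside the root (dor-il-i > doir-l-i)."""
    if root.endswith("r") and tense.startswith("i"):
        return root[:-1] + "i" + "r", tense[1:], person
    return root, tense, person


def assimilate(root: str, tense: str, person: str):
    """Assimilation: root-final r becomes l before an immediately following l
    (doir-l-i > doil-l-i)."""
    if root.endswith("r") and tense.startswith("l"):
        return root[:-1] + "l", tense, person
    return root, tense, person


def derive(root: str, tense: str, person: str):
    """Return every stage of the derivation, ending with the surface form."""
    stages = [(root, tense, person)]
    for rule in (metathesize, assimilate):
        stages.append(rule(*stages[-1]))
    # Segmented stages first, then the unsegmented surface form.
    return ["-".join(stage) for stage in stages] + ["".join(stages[-1])]


for line in derive("dor", "il", "i"):
    print(line)
```

Applying the rules in this order reproduces the four lines of (2a); running them in the opposite order would leave dor-il-i unaffected by assimilation, which is why the sketch fixes metathesis first.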


However, the same example can be arranged in the reverse order, which is not as
elaborate as the order in example (2a), especially with regard to the presentation of the verb
and the suffix roots.


(2b) c1v1v2c2c3v3
     doilli
     ‘you held’

     c1v1v2c2-c3-v3
     doil-l-i
     hold-PST-2.NH


Example (2b) suggests that the verb root is doil, not dor. There is no clue anywhere to
predict the verb root. Though elaborate, the presentation in example (2a) may give the
wrong impression that we are comparing the Kamrupi data with those of the standard
counterpart. Hence, we followed some general strategies to identify the verb as well as the
suffix roots. They are as follows:

• The first strategy was to rely on the native speakers’ knowledge of the verbal
forms in their dialect. The informants were asked to identify the verb roots from a list
of paradigmatic forms. For this and many other verbs, they confirmed with
little hesitation that there is no meaningful verb doil in their dialect and that the verb
root for this paradigmatic form is dor.

• The other strategy is to look at the simple present tense paradigm of a verb. Since neither
the Kamrupi variety of Assamese nor the standard variety has an overt
present tense marker, the simple present tense forms consist of the verb and
the person suffix (which is a single vowel) only. So the verb root and the person suffix
are easily identifiable. Let's take a look at the simple present tense forms of the verb
dor ‘hold’ in Table 1.
Table 1 - Simple present tense forms of the verb dor ‘hold’


        Singular        Plural
1       dor-u           dor-u
        hold-1SG        hold-1PL
        ‘I hold’        ‘We hold’
2.nh    dor-a           dor-a
        hold-2SG.NH     hold-2PL.NH
        ‘You hold’      ‘You hold’
2.mh    dor-2           dor-2
        hold-2SG.MH     hold-2PL.MH
        ‘You hold’      ‘You hold’
2.hh    dor-e           dor-e
        hold-2SG.HH     hold-2PL.HH
        ‘You hold’      ‘You hold’
3       dor-e           dor-e
        hold-3SG        hold-3PL
        ‘S/he holds’    ‘They hold’


Table 1 shows that when only the person marker is suffixed to the verb dor, the verb root
does not undergo any change. But when aspect or tense markers are suffixed to the
verb, forms such as doilli ‘you held’ result. We can try to recover the root form of the
verb by examining the phonological variations at the various levels of morphological
processes, as we did in example (2a). Identifying the affix roots, however, is relatively more
difficult, even for a native speaker. The discussion in Section 1.3 will help in identifying
the affixes.


1.3. The verbal affixes 
Verbal affixes
    Prefix:  Negation
    Suffix:  Tense, Aspect, Person

Figure 1 - Verbal inflections


Figure 1 presents the finite verbal inflections, both prefixed and suffixed to the root. This
paper focuses on the verbal morphophonemic variations that result from suffixation. In the
Kamrupi dialect of Assamese, there is no overt present tense marker. Hence, in the simple
present tense form, the verb is immediately followed by the person marker, whereas in the
simple past and future tense forms, the verb is followed by the tense suffix and then the
person marker. In the present imperfective form, the verb is followed by the aspect marker
and then the person marker. In the past imperfective form, the verb is followed by the aspect
marker, the past tense marker and the person marker, in that order. The future imperfective
is not formed like the present or the past imperfective; it consists of a non-finite form of the
main verb followed by supporting verbs, which carry the future tense marker and, where
needed, the person marker. Hence, we will not discuss those forms in this paper. In some
cases, the person marker is covert or absent. The verb-suffix sequence is shown in Table 2:
Table 2 - Verb-suffix sequence

              Present            Past                     Future
Simple        verb-PERSON        verb-PST-PERSON          verb-FUT-PERSON /
                                                          verb-FUT+PERSON
Imperfective  verb-ASP-PERSON    verb-ASP-PST-PERSON /
                                 verb-ASP-PST-Ø
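
The sequences in Table 2 can be restated as a small lookup table. The following is a minimal sketch in Python; the names SEQUENCES and gloss_template are illustrative assumptions rather than part of the analysis, and the zero person marker and the fused FUT+PERSON variant are left out for brevity.

```python
# Suffix slots for each (aspect, tense) combination, following Table 2.
# The future imperfective is absent because it is formed periphrastically.
SEQUENCES = {
    ("simple", "present"): ["PERSON"],
    ("simple", "past"): ["PST", "PERSON"],
    ("simple", "future"): ["FUT", "PERSON"],
    ("imperfective", "present"): ["ASP", "PERSON"],
    ("imperfective", "past"): ["ASP", "PST", "PERSON"],
}


def gloss_template(aspect, tense, verb="verb"):
    """Return the hyphenated verb-suffix template for a finite form."""
    slots = SEQUENCES.get((aspect, tense))
    if slots is None:
        raise ValueError(f"no suffixal template for {aspect} {tense}")
    return "-".join([verb] + slots)


print(gloss_template("imperfective", "past"))  # verb-ASP-PST-PERSON
```

Asking for the future imperfective raises an error in this sketch, which mirrors the decision above not to treat those periphrastic forms.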


The imperfective marker is -is with both vowel-final and consonant-final verbs, as in (3).


(3a) Vowel-final pa          (3b) Consonant-final kor
     paisa                        koissa
     pa-is-a                      kois-s-a
     get-ASP-2.NH                 koir-s-a
     ‘You are getting’            kor-is-a
                                  do-ASP-2.NH
                                  ‘You are doing’


The past marker is -l with vowel-final verbs and -il with consonant-final verbs, as in (4):


(4a) Vowel-final pa          (4b) Consonant-final kor
     pala                         koilla
     pa-l-a                       koil-l-a
     get-PST-2.MH                 koir-l-a
     ‘You got’                    kor-il-a
                                  do-PST-2.MH
                                  ‘You did’


The future tense suffix is phonologically conditioned by the final sound of the verb it 
appears with. Voicing plays an important role in this regard. In Figure 2, the suffixes for 
future tense that are phonologically conditioned by the verb final sounds are shown: 


The future tense suffix
    vowel-final verb:      -b
    consonant-final verb:  -ib   (+voiced)
                           -ip   (-voiced)
                           -ipʰ  (-voiced, +aspirated)
                           -b    (breathy voiced and pharyngeal)

Figure 2 - Suffixes for future tense


As Figure 2 shows, the future tense suffix varies from verb to verb depending on the final
sound of the verb. It is -ip with a verb ending in a voiceless sound, -ipʰ with a verb ending
in a voiceless aspirated sound, -ib with a verb ending in a voiced consonant, and -b with a
verb ending in a vowel, an h or a breathy voiced sound. The exceptions are the forms -im,
with consonant-final verbs, and -m, with vowel-final verbs, in which future tense and the
first person are fused.
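
The conditioning described above amounts to a decision on the verb-final sound. The following Python sketch illustrates it; the segment sets and the function name future_suffix are illustrative assumptions (the full consonant inventory is not listed here), and any consonant not in the sets is treated as plain voiced.

```python
# Illustrative segment classes; these inventories are assumptions.
VOWELS = {"a", "e", "i", "o", "u"}
VOICELESS = {"p", "t", "k", "s"}
VOICELESS_ASPIRATED = {"pʰ", "tʰ", "kʰ"}
BREATHY_OR_H = {"bʱ", "dʱ", "gʱ", "h"}


def future_suffix(final_segment, first_person=False):
    """Pick the future suffix from the verb-final segment."""
    is_vowel = final_segment in VOWELS
    if first_person:
        # Future tense and first person are fused.
        return "-m" if is_vowel else "-im"
    if is_vowel or final_segment in BREATHY_OR_H:
        return "-b"
    if final_segment in VOICELESS_ASPIRATED:
        return "-ipʰ"
    if final_segment in VOICELESS:
        return "-ip"
    # Remaining consonants are treated as plain voiced.
    return "-ib"


print(future_suffix("r"))                     # -ib
print(future_suffix("r", first_person=True))  # -im
```

The function takes a single segment rather than a whole root so that digraph-like symbols such as kʰ do not have to be parsed out of a string.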

In Section 2, we present the paradigms of a number of verbs in which the person markers are
clearly visible. Unlike the tense and aspect markers, the person markers are not directly
involved in the complex morphophonemic variations, although their form varies with tense
and aspect. The verbal paradigms in Section 2 will give a clear idea of them.
2. The verbal morphophonemic variations: applying the paradigmatic and syntagmatic approaches


This section deals with the two most important objectives of this paper: presenting the
paradigmatic arrangement of a verb, thereby identifying its allomorphic variations, and
analysing the phonological variations at the various levels of morphological processes with
the help of the syntagmatic approach. The same data are used for both. Table 3 presents
the paradigms of the verb kor ‘do’.


Table 3 - Verb kor ‘do’ in paradigmatic order

Person  Simple Present  Present Imperfective  Simple Past     Past Imperfective   Future
        kor-u           kois-s-u              koil-l-u        kois-s-il-u         kor-im
1       do-1            do-ASP-1              do-PST-1        do-ASP-PST-1        do-FUT.1
        ‘I do’          ‘I am doing’          ‘I did’         ‘I was doing’       ‘I will do’
        kor-a           kois-s-a              koil-l-i        kois-s-il-i         koir-b-i
2.NH    do-2.NH         do-ASP-2.NH           do-PST-2.NH     do-ASP-PST-2.NH     do-FUT-2.NH
        ‘You do’        ‘You are doing’       ‘You did’       ‘You were doing’    ‘You will do’
        kor-o           kois-s-o              koil-l-a        kois-s-il-a         koir-b-a
2.MH    do-2.MH         do-ASP-2.MH           do-PST-2.MH     do-ASP-PST-2.MH     do-FUT-2.MH
        ‘You do’        ‘You are doing’       ‘You did’       ‘You were doing’    ‘You will do’
        kor-e           kois-s-i              koil-l-ak       kois-s-il           koir-b-o
2.HH    do-2.HH         do-ASP-2.HH           do-PST-2.HH     do-ASP-PST.2.HH     do-FUT-2.HH
        ‘You do’        ‘You are doing’       ‘You did’       ‘You were doing’    ‘You will do’
        kor-e           kois-s-i              koil-l-ak       kois-s-il           koir-b-o
3       do-3            do-ASP-3              do-PST-3        do-ASP-PST.3        do-FUT-3
        ‘S/he does’     ‘S/he is doing’       ‘S/he did’      ‘S/he was doing’    ‘S/he will do’


Table 3 presents the paradigms of the verb kor. It helps to identify the alternation of the
root in relation to the different suffixes. The varied forms are categorised as the allomorphs
of kor: kor, koir, koil and kois.

We have chosen a few forms of the same verb kor from Table 3, i.e. the 1st person present
imperfective in column (a), simple past in column (b) and past imperfective in column (c),
and the 2nd person non-honorific and 1st person future forms in columns (d) and (e)
respectively, to analyse the morphophonemic variations by putting them in the syntagmatic
order. They are shown in Table 4, where each row gives the next stage of the derivation.


Table 4 - Morphophonemic variation in the verb kor ‘do’

(a)          (b)          (c)             (d)          (e)
kor-is-u     kor-il-u     kor-is-il-u     kor-ib-i     kor-im
koir-s-u     koir-l-u     koir-s-il-u     koir-b-i     korim
kois-s-u     koil-l-u     kois-s-il-u     koir-b-i
koissu       koillu       koissilu        koirbi


The variations and processes observed are:

• Metathesis of i and r in (a), (b), (c) and (d).
• o > oi
• Assimilation of r to s in (a) and (c) and of r to l in (b), resulting in the geminates ss
and ll.
• r and b form a consonant cluster in (d). The rule of assimilation does not apply
here because the alveolar flap r is followed by a bilabial plosive b, rather than by another
alveolar or a lateral sound.
• Neither metathesis nor assimilation has taken place in (e), since the FUT.1 suffix -im
is not followed by any other vowel.


Having analysed an r-final verb, i.e. a verb with a flap in the final position, we will now
look at the paradigm of a verb with a voiced alveolar fricative z in the final position to see
what kind of morphophonemic variations take place. Table 5 presents the paradigm of the
verb baz ‘fry’.


Table 5 - Verb baz ‘fry’ in paradigmatic order

Person  Simple Present  Present Imperfective  Simple Past     Past Imperfective   Future
        baz-u           bait-s-u              baiz-l-u        bait-s-il-u         baz-im
1       fry-1           fry-ASP-1             fry-PST-1       fry-ASP-PST-1       fry-FUT.1
        ‘I fry’         ‘I am frying’         ‘I fried’       ‘I was frying’      ‘I will fry’
        baz-a           bait-s-a              baiz-l-i        bait-s-il-i         baiz-b-i
2.NH    fry-2.NH        fry-ASP-2.NH          fry-PST-2.NH    fry-ASP-PST-2.NH    fry-FUT-2.NH
        ‘You fry’       ‘You are frying’      ‘You fried’     ‘You were frying’   ‘You will fry’
        baz-o           bait-s-o              baiz-l-a        bait-s-il-a         baiz-b-a
2.MH    fry-2.MH        fry-ASP-2.MH          fry-PST-2.MH    fry-ASP-PST-2.MH    fry-FUT-2.MH
        ‘You fry’       ‘You are frying’      ‘You fried’     ‘You were frying’   ‘You will fry’
        baz-e           bait-s-i              baiz-l-ak       bait-s-il           baiz-b-o
2.HH    fry-2.HH        fry-ASP-2.HH          fry-PST-2.HH    fry-ASP-PST.2.HH    fry-FUT-2.HH
        ‘You fry’       ‘You are frying’      ‘You fried’     ‘You were frying’   ‘You will fry’
        baz-e           bait-s-i              baiz-l-ak       bait-s-il           baiz-b-o
3       fry-3           fry-ASP-3             fry-PST-3       fry-ASP-PST.3       fry-FUT-3
        ‘S/he fries’    ‘S/he is frying’      ‘S/he fried’    ‘S/he was frying’   ‘S/he will fry’


The allomorphs of baz ‘fry’ are baz, baiz and bait.
The 1st person present imperfective, simple past and past imperfective forms and the 2nd
person non-honorific and 1st person future forms from Table 5 are analysed in Table 6.


Table 6 - Morphophonemic variation in the verb baz ‘fry’

(a)          (b)          (c)             (d)          (e)
baz-is-u     baz-il-u     baz-is-il-u     baz-ib-i     baz-im
baiz-s-u     baiz-l-u     baiz-s-il-u     baiz-b-i     bazim
bait-s-u     baiz-l-u     bait-s-il-u     baizbi
baitsu       baizlu       baitsilu


The morphophonemic variations and processes observed are as follows:

• Metathesis of i and z in (a), (b), (c) and (d).
• zs > ts in (a) and (c).
• a > ai
• Formation of the clusters zl in (b) and zb in (d).


7. Verbal morphophonemics in Kamrupi Assamese 


• In (e), the FUT.1 suffix -im is not followed by another vowel. As a result, the
metathesis of i and z does not take place as it does in (a), (b), (c) and (d).


The next verb we deal with, kin ‘buy’, has a nasal in its final position. Among all the
verbs analysed in this paper, this verb shows the least allomorphic variation. This is
because the vowel of the verb and the initial sound of the following tense and aspect
suffixes are identical, i.e. i. The two i sounds merge, which limits the possibility of
allomorphic variation.


Table 7 - Verb kin ‘buy’ in paradigmatic order

Person  Simple Present  Present Imperfective  Simple Past     Past Imperfective   Future
        kin-u           kin-s-u               kin-l-u         kin-s-il-u          kin-im
1       buy-1           buy-ASP-1             buy-PST-1       buy-ASP-PST-1       buy-FUT.1
        ‘I buy’         ‘I am buying’         ‘I bought’      ‘I was buying’      ‘I will buy’
        kin-a           kin-s-a               kin-l-i         kin-s-il-i          kin-b-i
2.NH    buy-2.NH        buy-ASP-2.NH          buy-PST-2.NH    buy-ASP-PST-2.NH    buy-FUT-2.NH
        ‘You buy’       ‘You are buying’      ‘You bought’    ‘You were buying’   ‘You will buy’
        kin-o           kin-s-o               kin-l-a         kin-s-il-a          kin-b-a
2.MH    buy-2.MH        buy-ASP-2.MH          buy-PST-2.MH    buy-ASP-PST-2.MH    buy-FUT-2.MH
        ‘You buy’       ‘You are buying’      ‘You bought’    ‘You were buying’   ‘You will buy’
        kin-e           kin-s-i               kin-l-ak        kin-s-il            kin-b-o
2.HH    buy-2.HH        buy-ASP-2.HH          buy-PST-2.HH    buy-ASP-PST.2.HH    buy-FUT-2.HH
        ‘You buy’       ‘You are buying’      ‘You bought’    ‘You were buying’   ‘You will buy’
        kin-e           kin-s-i               kin-l-ak        kin-s-il            kin-b-o
3       buy-3           buy-ASP-3             buy-PST-3       buy-ASP-PST.3       buy-FUT-3
        ‘S/he buys’     ‘S/he is buying’      ‘S/he bought’   ‘S/he was buying’   ‘S/he will buy’


The verbs analysed so far have more than one allomorph, but Table 7 presents a verb with
only one form, kin. Unlike in the previous verbs, the monophthong of this verb has not
turned into a diphthong but has merged with the initial vowel of the following suffix,
since the two are identical.

Let us look at the analysis, in syntagmatic order, of the 1st person present imperfective,
simple past and past imperfective forms and the 2nd person non-honorific and 1st person
future forms from Table 7, given in Table 8.


Table 8 - Morphophonemic variation in the verb kin ‘buy’

(a)          (b)          (c)             (d)          (e)
kin-is-u     kin-il-u     kin-is-il-u     kin-ib-i     kin-im
kiin-s-u     kiin-l-u     kiin-s-il-u     kiin-b-i     kinim
kinsu        kinlu        kinsilu         kinbi


The variations observed in Table 8 are:

• The metathesis of i and n in (a), (b), (c) and (d) gives rise to an ii sequence.
• ii > i
• Formation of the clusters ns in (a) and (c), nl in (b) and nb in (d).
• Metathesis does not take place in (e), since the FUT.1 suffix -im is not
followed by another vowel.
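
The derivations in Tables 4, 6 and 8 follow the same ordered steps, which can be traced mechanically. The following Python sketch is a hedged illustration only: the function name derive, the plain-letter transcription and the rule ordering are assumptions based on the forms shown in those tables, and the sketch does not model h-final verbs or the exceptions discussed in Section 3.1.

```python
VOWELS = set("aeiou")


def derive(root, suffixes):
    """Return the successive stages of a derivation as hyphenated strings."""
    suffixes = list(suffixes)
    stages = ["-".join([root] + suffixes)]

    # Metathesis needs a consonant-final root and an i-initial suffix that
    # is itself followed by further material (so it skips a bare -im).
    if (len(suffixes) < 2 or not suffixes[0].startswith("i")
            or root[-1] in VOWELS):
        return stages

    # Metathesis: the suffix-initial i moves in front of the root-final consonant.
    root = root[:-1] + "i" + root[-1]
    suffixes[0] = suffixes[0][1:]
    stages.append("-".join([root] + suffixes))

    # ii > i: identical vowels merge (as with kin).
    if "ii" in root:
        root = root.replace("ii", "i")
        stages.append("-".join([root] + suffixes))

    following = suffixes[0][0]
    if root.endswith("r") and following in ("s", "l"):
        # Assimilation: r copies a following s or l, giving a geminate.
        root = root[:-1] + following
        stages.append("-".join([root] + suffixes))
    elif root.endswith("z") and following == "s":
        # zs > ts
        root = root[:-1] + "t"
        stages.append("-".join([root] + suffixes))
    return stages


for stage in derive("kor", ["is", "u"]):
    print(stage)
```

For kor with the suffixes -is and -u the sketch prints kor-is-u, koir-s-u and kois-s-u, the stages of column (a) in Table 4; removing the hyphens from the last stage gives the surface form.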


We have so far looked at verbs with a single vowel and noticed how a monophthong
turns into a diphthong as a result of the morphophonemic processes. As the verb presented
in Table 7 shows, however, a monophthong does not always change into a diphthong; the
quality of the vowel plays the key role in this regard. Let us move on to a verb with a
diphthong, daur ‘run’, the paradigms of which are presented in Table 9.


Table 9 - Verb daur ‘run’ in paradigmatic order

Person  Simple Present  Present Imperfective  Simple Past     Past Imperfective   Future
        daur-u          dauis-s-u             dauil-l-u       dauis-s-l-u         daur-im
1       run-1           run-ASP-1             run-PST-1       run-ASP-PST-1       run-FUT.1
        ‘I run’         ‘I am running’        ‘I ran’         ‘I was running’     ‘I will run’
        daur-a          dauis-s-a             dauil-l-i       dauis-s-l-i         dauir-b-i
2.NH    run-2.NH        run-ASP-2.NH          run-PST-2.NH    run-ASP-PST-2.NH    run-FUT-2.NH
        ‘You run’       ‘You are running’     ‘You ran’       ‘You were running’  ‘You will run’
        daur-o          dauis-s-o             dauil-l-a       dauis-s-l-a         dauir-b-a
2.MH    run-2.MH        run-ASP-2.MH          run-PST-2.MH    run-ASP-PST-2.MH    run-FUT-2.MH
        ‘You run’       ‘You are running’     ‘You ran’       ‘You were running’  ‘You will run’
        daur-e          dauis-s-i             dauil-l-ak      dauis-s-il          dauir-b-o
2.HH    run-2.HH        run-ASP-2.HH          run-PST-2.HH    run-ASP-PST.2.HH    run-FUT-2.HH
        ‘You run’       ‘You are running’     ‘You ran’       ‘You were running’  ‘You will run’
        daur-e          dauis-s-i             dauil-l-ak      dauis-s-il          dauir-b-o
3       run-3           run-ASP-3             run-PST-3       run-ASP-PST.3       run-FUT-3
        ‘S/he runs’     ‘S/he is running’     ‘S/he ran’      ‘S/he was running’  ‘S/he will run’


The allomorphs of the verb daur ‘run’ are daur, dauis, dauir and dauil. 


In Table 10, the 1st person present imperfective, simple past and past imperfective forms and
the 2nd person non-honorific and 1st person future forms from Table 9 are analysed.


Table 10 - Morphophonemic variation in the verb daur ‘run’

(a)           (b)           (c)              (d)           (e)
daur-is-u     daur-il-u     daur-is-il-u     daur-ib-i     daur-im
dauir-s-u     dauir-l-u     dauir-s-il-u     dauir-b-i     daurim
dauis-s-u     dauil-l-u     dauis-s-il-u     dauirbi
dauissu       dauillu       dauissilu


The variations observed are as follows:

• Metathesis of i and r in (a), (b), (c) and (d).
• The diphthong au becomes the triphthong aui.
• Assimilation of r to s in (a) and (c) and of r to l in (b), resulting in the geminates ss
and ll.
• Formation of an rb cluster in (d).
• Metathesis does not take place in (e), since the FUT.1 suffix -im is not
followed by another vowel.
The final verb we deal with is ah ‘come’, which shows more complex variation than the
verbs we have dealt with so far. Besides some variations shared with the previous verbs,
this verb presents some other striking alternations. It would be wrong to assume that the
final sound h alone plays the key role here, because Section 3.1 presents another h-final
verb that behaves differently from ah ‘come’. Table 11 presents the paradigms of ah ‘come’.


Table 11 - Verb ah ‘come’ in paradigmatic order

Person  Simple Present  Present Imperfective  Simple Past     Past Imperfective   Future
        ah-u            ai-s-u                aih-l-u         ai-s-l-u            ah-im
1       come-1          come-ASP-1            come-PST-1      come-ASP-PST-1      come-FUT.1
        ‘I come’        ‘I am coming’         ‘I came’        ‘I was coming’      ‘I will come’
        ah-a            ai-s-a                aih-l-i         ai-s-l-i            ai-b-i
2.NH    come-2.NH       come-ASP-2.NH         come-PST-2.NH   come-ASP-PST-2.NH   come-FUT-2.NH
        ‘You come’      ‘You are coming’      ‘You came’      ‘You were coming’   ‘You will come’
        ah-o            ai-s-o                aih-l-a         ai-s-l-a            ai-b-a
2.MH    come-2.MH       come-ASP-2.MH         come-PST-2.MH   come-ASP-PST-2.MH   come-FUT-2.MH
        ‘You come’      ‘You are coming’      ‘You came’      ‘You were coming’   ‘You will come’
        ah-e            ai-s-i                aih-l-ak        ai-s-il             ai-b-o
2.HH    come-2.HH       come-ASP-2.HH         come-PST-2.HH   come-ASP-PST.2.HH   come-FUT-2.HH
        ‘You come’      ‘You are coming’      ‘You came’      ‘You were coming’   ‘You will come’
        ah-e            ai-s-i                aih-l-ak        ai-s-il             ai-b-o
3       come-3          come-ASP-3            come-PST-3      come-ASP-PST.3      come-FUT-3
        ‘S/he comes’    ‘S/he is coming’      ‘S/he came’     ‘S/he was coming’   ‘S/he will come’


The allomorphs are ah, aih and ai.
The 1st person present imperfective, simple past and past imperfective forms and the 2nd
person non-honorific and 1st person future forms from Table 11 are analysed in Table 12.


Table 12 - Morphophonemic variation in the verb ah ‘come’

(a)         (b)         (c)            (d)         (e)
ah-is-u     ah-il-u     ah-is-il-u     ah-ib-i     ah-im
aih-s-u     aih-l-u     aih-s-il-u     aih-b-i     ahim
ai-s-u      aihlu       ai-is-l-u      ai-b-i
aisu        aihlu       aislu          aibi


The variations are as follows:

• Metathesis of i and h in (a), (b), (c) and (d).
• h is deleted, possibly under the influence of s in (a) and (c) and of b in (d).
• Metathesis of the i of -il and the s of -is in (c).
• ii > i
• Formation of the clusters hl in (b) and sl in (c).
3. The ideal environments


In Section 1.3, we looked at the verb-suffix sequences. Not all of those sequences are ideal
environments for morphophonemic variation; that is, the phonological changes do not take
place in every environment. Only consonant-final verbs are candidates, and only certain
sequences form an ideal environment. It must be admitted, however, that even when a verb
and its suffixes are in an ideal sequence, the verb may behave differently from other
phonologically similar verbs in similar environments. The verb-suffix sequences that form
ideal environments for morphophonemic variation are as follows:


• Verb-aspect-person
• Verb-tense (PST/FUT)-person marker
• Verb-aspect-tense (PST)-overt person marker/Ø person marker


3.1. Exceptions 


Our analysis so far has shown that the morphophonemic variations take place when the verb
and its suffixes occur in certain patterns. This observation is not without exceptions,
however. One exception is shown in example (5), which presents some forms of the verb
mus ‘wipe’. In examples (5) and (6), the 1st person present imperfective and past indefinite
forms are compared.


(5)  (a)           (b)                (c)
     mus-is-u      mus-is-u           mus-il-u
     muis-s-u      musisu             muis-l-u
     *muissu       ‘I am wiping’      muislu
                                      ‘I wiped’
                                      *musilu

(6)  (a)                (b)           (c)
     baz-is-u           baz-is-u      baz-il-u
     bait-s-u           baz-is-u      baiz-l-u
     baitsu             *bazisu       baizlu
     ‘I am frying’                    ‘I fried’
                                      *bazilu


Unlike in the verbs we dealt with previously, metathesis does not take place between the
final alveolar fricative s of the verb mus ‘wipe’ and the initial vowel i of the aspect marker
-is. The initial vowel i of the past tense marker -il and the verb-final consonant s, however,
do exchange positions, i.e. metathesis does occur. Hence, a form like *muissu is not
possible; instead we get musisu ‘I am wiping’, in contrast to the forms of the verb baz ‘fry’
shown in Table 6. Both mus ‘wipe’ and baz ‘fry’ have an alveolar fricative in final position,
although the one in mus ‘wipe’ is voiceless and the one in baz ‘fry’ is voiced. Similarly,
Table 12 shows that the h of ah gets omitted when it is immediately followed by i. But
another h-final verb, kah ‘cough’, is an exception in this regard, as shown in example (7).
(7) kah ‘cough’

    (a)                 (b)
    kah-is-u            kah-is-il-u
    kaih-s-u            kaih-s-l-u
    kaihsu              kaiih-s-l-u
    *kaisu              kaihslu
    ‘I am coughing’     *kaislu
                        ‘I coughed’

(8) ah ‘come’

    (a)                 (b)
    ah-is-u             ah-is-il-u
    aih-s-u             aih-s-il-u
    ai-s-u              ai-is-l-u
    aisu                aislu
    ‘I am coming’       ‘I came’


Example (7), when compared with the forms of the phonologically similar verb ah ‘come’
in example (8), shows some notable exceptions. In the forms of kah ‘cough’ cited
here, the final consonant h is not fully omitted, whereas in the forms of ah ‘come’ it
is. So forms like aisu and aislu are possible, whereas *kaisu and *kaislu are not; instead,
we get the forms kaihsu and kaihslu.


4. Conclusion 


This paper is a modest attempt to bring to light some aspects of the morphophonemic
variations in the Nalbariya Kamrupi variety of Assamese. At the very outset, we mentioned
that our analysis covers only the verbal morphophonemics. Moreover, we deal only with
the finite verbal suffixes, excluding the prefixes and non-finite suffixes. Hence, many other
facets of this area remain unexplored in this paper.

Let us restate that in this paper we do not look at the data in the light of the standard
counterpart. This method proved quite effective for a systematic exploration of the amusing
interplay of morphophonemics unique to the dialect.

Tense and aspect are two problematic areas in both the standard and the Kamrupi varieties.
For instance, the aspect or imperfective marker -s or -is does not always mean progression
or incompletion but very often refers to the completion of an action, too. Also, its
occurrence in the present tense does not always refer to actions taking place in the current
time frame; it can also refer to actions that took place in the remote past. The co-occurrence
of the imperfective marker and the past tense marker -l or -il involves a much more difficult
level of interpretation, since the same form may convey several meanings and can refer to
different time frames. Hence, the translations of such forms given in the paradigm tables
should not be considered final; we have mentioned only one of the possible meanings of
each form there.

Some of the appealing areas for further research in Kamrupi morphophonemics are as 
follows: 

• Morphophonemics of verbal negation

• Morphophonemic variations in word classes other than verbs

• Morphophonemic variations of the non-finite verbs, etc.
Abbreviations

Ø       Zero
1       First person
2.HH    Second person high honorific
2.MH    Second person mid honorific
2.NH    Second person non-honorific
3       Third person
ASP     Aspect
FUT     Future
PL      Plural
PST     Past
SG      Singular
groups throughout different geographical regions in south-western Myanmar and in Bangladesh. This
study examines a dialect of Asho Chin spoken in the north-western region of Yangon, Myanmar. 
singular), na= (second person, singular), ?a= (third person, singular, used with transitive verbs only),
and má- (plural). Initially we will look at the morphological and syntactic features of person marking,
and then briefly describe the inverse marking system in Asho Chin.
1. Profile of the language
1.1. Location, genetic affiliation and number of speakers 


Asho Chin (ISO 639-3: csh), a.k.a. Plains Chin, is a Tibeto-Burman language that belongs to
the southern Chin sub-branch of the Kuki-Chin branch (cf. Bradley 1997). The language has 
several alternate names such as Khyang (Bernot and Bernot 1958) and Ashó (Baptist Board of 
Publications 1998 [1952]). It is mainly spoken in the Ayeyarwaddy river lowlands and 
Rakhine mountain areas in southwest Myanmar, as illustrated in Figure 1. The total 
population of Asho Chin speakers is estimated to be 34,000 (Lewis et al. 2015). 

The Asho Chin speaking areas are scattered throughout south-western Myanmar. 
According to Grierson, there are at least two dialects, i.e., the northern dialect spoken in the
Chittagong Hill Tracts and the southern dialect spoken in Rakhine State (Grierson 1904:
341-342). VanBik suggests that Asho Chin may have as many as six dialects, most of which are
mutually intelligible. Their names and the places where they are spoken are as follows: Settu 
(Sittwe to Thandwe — mostly Sittwe to Ann), Laitu (Sedouttaya Township), Awttu (Mindon 
Township), Kowntu (Ngaphe, Minhla, Minbu), Kaitu (Pegu, Mandalay, Magwe etc.), and 
Lautu (Nyetone, Kyauk Phyu, Ann) (VanBik 2009: 37-38). This paper mainly deals with the 
Asho Chin language which is spoken in Insein Township, the north-western area of Yangon. 

Asho Chin has its own orthography based on the Pwo Karen alphabet. This orthography 
has been primarily adopted by many local Baptists and some Buddhists. A translation of the 
New Testament and a primer (Baptist Board of Publications 1998 [1952]) written in the Asho 
Chin alphabet were published in the middle of the 20th century. Many Asho Chin children
today learn how to read and write the Asho Chin language at Sunday schools or Buddhist
temples.

¹ I am deeply grateful to Mr. Salai Kyaw Htwe Hercules, who provided me with helpful comments and
suggestions as well as a lot of precious data. Any errors that may remain are my own. This work was supported
by JSPS Grant-in-Aid for JSPS Fellows.

In February 2012, the Asho Chin National Party (ACNP) formally requested that the Election
Commissioner recognize the ACNP for the Chins who are not included in the present Chin
State of Myanmar. In June 2012 the party was finally granted permission, for the first time in
history, to register as a political party. The party recently started its political campaign and
field trips throughout the Asho Chin-speaking area. Meanwhile, the Asho Chin Literature and
Culture Central Committee has also been founded to promote language education and
traditional culture beyond the limits of religion and region. The committee regularly
publishes bilingual magazines such as “Asho's Light” /?aea?dwdsozon/, and helps bring
Asho Chin primers and dictionaries to publication.

Figure 1 - Asho Chin speaking area

My consultant, Mr. Salai Kyaw Htwe Hercules (socóio$wo: /s"álàrcót"wér/, born in 
Yangon in 1962), is a committee member who is currently studying the Asho Chin traditional 
clan system at the Asho Baptist Church in Insein Township. He is fluent in both Asho
Chin and Burmese, so we communicate mainly in Burmese.


1.2. Asho Chin and Burmese (Myanmar) 


The Burmese exonym “Chin” may have been derived from the language of Asho Chin, the 
ethnic group with whom the Burmese first made contact. In Asho Chin, the word for “person” 
is /k'laun/. When the Burmese met the Asho Chin, they must have taken the word /K"láow/ to 
refer to them (VanBik 2009: 4). The Burmese had already lost the D cluster, so that the 
closest approximation they could use was K^y-; thus the term k’vay in written Burmese, or the 
alternative name fe^? (conventionally spelled *Chin") in colloquial modern Burmese, appeared 
to designate any Chin group. 

Under the strong influence of Burmese culture and language, Asho Chin has adopted a 
number of loan words from Burmese, including nouns (e.g., /Pdunji/ ‘shirt? < BURM: déit), 
verbs (e.g., /p^wáw/ ‘to open’ < BURM: p'wi), adverbials (e.g., /2dtódóÓ/ ‘together’ < BURM: 
Pdtudu), and even some grammatical particles (e.g., /=6a/ ‘polite particle’ < BURM: pa). 
1.3. Phonology


This section presents the phonology of Asho Chin that the author has described through 
fieldwork. A majority of Asho Chin morphemes are phonologically monosyllabic (e.g., /Jé/
‘head’) or bisyllabic (e.g., /na?dau/ ‘child’). The monosyllabic structure can be represented as 
C1(C2)V1(V2)(C3) / T where C and V stand for a consonant and a vowel, respectively, and T 
indicates the tone of the whole syllable. A bisyllabic structure consists of two monosyllabic 
structures. 
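
The template C1(C2)V1(V2)(C3) can be checked mechanically once a syllable is reduced to a string of C and V symbols. A minimal sketch in Python follows; the function name and the use of abstract CV skeletons are assumptions made for illustration, and tone (T) and the restrictions on individual slots are not modelled.

```python
import re

# C1 is obligatory, C2 optional; V1 is obligatory, V2 optional; C3 optional.
MONOSYLLABLE = re.compile(r"C{1,2}V{1,2}C?")


def is_monosyllable(skeleton):
    """Return True if a CV skeleton such as 'CVC' fits C1(C2)V1(V2)(C3)."""
    return MONOSYLLABLE.fullmatch(skeleton) is not None


print(is_monosyllable("CCVVC"))  # True
print(is_monosyllable("VC"))     # False
```

Because an onset consonant is obligatory in the template, a vowel-initial skeleton such as VC is rejected.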

Consonant phonemes are: /p, p^, b,6,t, t^, d, d; k,k^,g,?,c[te], c^ [te^], j [dz] , s, 
$*,z,e,h, fi, m,n, j,g,hm [mm], hn [pn], hn [pn] , hy [ġo] , 1, hl [I], r [3] w , y BY. 
Asho Chin appears to have lost a variety of final consonants in comparison to other Chin 
languages such as Daai Chin (So-Hartmann 2009), Mizo (Chhangte 1993) and Tiddim Chin 
(Henderson 1965). Asho Chin thus resembles the phonology of the lingua franca Burmese. C2 
is restricted to /w, y, l/. 

Vowel phonemes are: /i, 1, €, a, a, 2, 0, U, u, er, ar, ao/. There is also a set of nasalized
vowels, where /n/ indicates the nasalization of the preceding vowel, such as /aw/ [à]. Vowel
length, or duration, is not a distinctive feature of Asho Chin vowels.

There are two distinctive tones, i.e., a high tone / '/ [1] and a low tone / / [1]. The
actual pitch contour of each tone, however, may vary phonetically due to intonation. A falling 
pitch [V] often occurs in actual conversation, which may be an allotone of the low tone / `/. In 
addition, there is an atonic syllable: Că. 


1.4. Typological features 


From a typological point of view, Asho Chin is a predicate-final language and its unmarked 
word order is SV in an intransitive clause, as in (1), and AOV in a transitive clause, as in (2) 
and (3). Asho Chin is an agglutinative language, and the grammatical relation between a verb 
and its arguments is generally represented by various grammatical particles (clitics). Asho 
Chin has an ergative case marker =na? and a primary object marker =ha~=ya~=ka. 


(1) cér p'0=ha (ka=)si?=ka?
    1SG 3SG=COM (1SG=)go=REAL
    ‘I went with him.’


(2) wi=naP k'laun=ya (?G=)s0=haP
    dog=ERG human=OBJ (3SG=)bite=REAL
    ‘A dog bit a man.’

(3) pye?p'yaà=na? pas'én=yna námó mlo??é1dun=lwi (?d=)pàr?=ká?
    PN=ERG PN=OBJ DEM toy=PL (3SG=)give=REAL
    ‘Pyay Phyoe gave those toys to Pasen.’


Verbs can be defined as words that may be followed by the modal particles =ha? (REAL) 
and —/iár (IRR). Asho Chin's finite verbs usually co-occur with various modal markers in main 
clauses. In most assertive sentences, either a realis marker =Ad? or an irrealis marker —fiár 
follows the predicate verb phrase, as seen in (4) and (5). 
(4) só-fhá?
    BURM:gather-REAL
    ‘have gathered’ [Realis]


(5) só-hár
    BURM:gather-IRR
    ‘be going to gather’ [Irrealis]


Some of the grammatical particles with an initial consonant A, such as the modal particles 
—fià? (REAL) and —/idr (IRR) above, undergo the simple phonological rule of assimilation, as 
illustrated below. 


fik (or Jud 


(6) a. hmá?-ka? 
BURM:know-REAL 


b. p'ó?-kár 
BURM:read-IRR 


(7) a. WáüUN-n9á? 
BURM:enter-REAL 


b. p'wan=ndi 
BURM:open=IRR 


1.5. Verb stem alternation 


Most of the Chin languages have been reported to exhibit a unique verb stem alternation 
system, where each verb has two stems. Overall, the verb stem alternation may be 
synchronically related to transitivity and nominalization. The morphological description of 
the two alternative forms and the linguistic function of the choice between the two forms are 
language-dependent and too complicated to be examined in detail here; therefore we are not 
concerned here with these matters. A detailed discussion about the Kuki-Chin verb alternation 
system can be found in Henderson (1965: 84-9), Chhangte (1993: 135-175), and So- 
Hartmann (2009: 97—107). 

Hyman and VanBik (2002) report that in Hakha Lai (Central Chin), 80% of verbs have 
two distinct forms. In Daai Chin (Southern Chin), however, less than 20% of verbs exhibit 
this stem alternation according to So-Hartmann (2009). Since Asho Chin also belongs to the
Southern Chin sub-branch, we surmise that verb stem alternation affects only a minority of
verbs in Asho Chin as well. In this paper, verb stem alternations are not indicated in the
glosses because my data are insufficient.

There are some verb pairs where two similar verbs share the same semantic meaning, as 
seen in (8) and (9); however, further investigation is necessary to see whether these pairs are 
related to verb stem alternation. 
(8) a. 5'52-k3?
look-REAL 
‘Someone looked.’ 


b. s"3=heér 
look=IMP 
‘Look!’ 


(9) a. siP=kae 
go=REAL 
‘Someone went.’ (No restriction.) 


b. st=hd? 
gO=REAL 
* tYou/Ij went.’ (The agent is restricted to the speech-act participant.) 


1.6. Previous studies 


Before proceeding to my own observations, let us briefly review previous studies. To my 
knowledge, there are only a few previous studies of the Asho Chin language. A linguistic 
overview and basic word list of the Sandoway (Thandwe) dialect were prepared by Fryer 
(1875). Houghton (1892, 1895) presented some example sentences in the Magwe dialect, 
adding a basic word list as an appendix. The previous works on Asho Chin dialects were later 
summarized by Grierson (1904: 331-346). Note that the phonemic symbols in the previous 
studies are somewhat idiosyncratic, and that their method of describing morphosyntax differs 
from the one adopted in this paper. 


2. Person marking systems in Asho Chin 


There are two major types of personal pronouns in Asho Chin, i.e., independent personal 
pronouns (Table 1) and clitic pronouns (Table 2). Gender is not distinguished in the Asho Chin 
person marking system. Neither independent personal pronouns nor clitic pronouns are 
compulsory; they are often omitted as long as they are predictable from context. 


2.1. Independent personal pronouns 

The set of independent personal pronouns is shown in Table 1. The dual pronouns are seldom 
used; thus, it is still questionable whether the category of dual number should be posited for 
modern Asho Chin’s person marking system, although previous studies such as Fryer (1875) 
and Houghton (1895) claim that Asho Chin has dual pronoun forms. 


Table 1 - Independent personal pronouns 


     SG        DU                        PL 
1    cer       céthni                    cáméi/?áméi (INC) 
2    NAUN      naonhni                   naunmel 
3    yar/p"o   yaehni/Payarhni/nahwer    yarmer/p"omer/nahé 


First- and second-person pronouns do not take the ergative marker =nà?, as shown in 
examples (10) and (11). 

(10)  cér nàoN 2ds"áN ka=yauP=ka ka=k'aP=kaP 
1SG 2SG BURM:sound 1SG=hear=CONJN 1SG=awake=REAL 
‘I heard your voice and woke up.’ 


(11) 2amer ?ac"anjan cézó táun=dùN 
1PL.INC BURM:each other BURM:gratitude BURM:put-NMLZ 


ma=sar=la=hé=har 
PL=make=MOD:DEO=PL=IRR 
‘We must express our gratitude to each other.’ 


As previously discussed, the dual pronouns are rarely used today; however, they 
occasionally appear in some folktales and traditional songs, as shown below. 


(12)  náhwér-há ná-só pá-»5? mwér=fià? 
3DU-LOC DU-son CL-one exist=REAL 
‘The couple has a son.’ 


(13) cémní PàuP??ó^?dpá dó-ná tun+mo+saun+k'lo=ha 
1DU mother+father die-if millet+master+rice+spirit=OBJ 


ná-wáownér-hár 
DU-enter-IRR 
‘If both of us, mother and father, die, we will go into the God of agriculture.’ 


2.2. Clitic pronouns 


Clitic pronouns in Asho Chin (Table 2), which precede either a VP or an NP, may be related 
to the Kuki-Chin verbal agreement system traditionally known as ‘pronominalization’, where 
the verbal affixes or clitics are assumed to have been derived from independent pronouns 
(Van Driem 1993). As mentioned above (cf. (13)), the dual clitic pronoun ná= (DU=) 
is seldom used at present; the dual category is therefore left out of the table below. 


Table 2 - Clitic pronouns 


     TRANSITIVE       INTRANSITIVE 
     SG      PL       SG      PL 
1    ka=     mă=      kă=     ma= 
2    na=     ma=      na=     ma= 
3    ?a=     ma=      -       ma= 


(14) 66 { ka= / na= / ?a= } Pér=hà? 
rice 1SG= 2SG= 3SG= eat=REAL 
‘{I/You/He} ate rice.’ 


(15)  cér=nà? pO má=si?=ká? 
1SG=and 3SG PL=go=REAL 
‘I and he went. (He and I went.)’ 

Clitic pronouns are always omitted for third-person singular subjects in intransitive 
clauses, as shown in (16). In such a case, the modal particles =há? (REAL) / =hár (IRR) change 
from high tone to low tone if the underlying tone of the preceding syllable is high, as seen in 
(17) and (18). 


(16) zòn+thón=bò { ka= / na= / *?a= } kwar=ha? 
mountain+above=ALL 1SG= 2SG= 3SG= climb=REAL 
‘{I/You} climb the mountain.’ 


(17) zont+han=b5 kwár=hà? 
mountain+above=ALL climb=3:REAL 
‘He climbs the mountain.’ 


(18)  só-hà yí=ná pwár-für 
DEM-LOC sell=if good-3:IRR 
‘If you sell it there, it will be good.’ 


Certain verbs change from an underlying low tone to high tone when they occur with a 
first or second person subject, as seen in (19).⁷ 


(19) a. ka={ ló / *lo }=ha? 
1SG=come=REAL 
‘I came.’ 


b. na={ ló / *lo }=ha? 
2SG=come=REAL 
‘You came.’ 


Aside from the clitic pronoun ma=, the plurality of verbs can also be indicated by 
the optional postposed plural marker =hé, which can also follow nouns, as shown below. 


(20) nă=hè wo-plár-t'ó lo=hé=har 
3=PL Myanmar+BURM:country=ABL come-PL-REAL 
‘They came from Myanmar.’ 


(21) coénzo+connadv=hé=nar Papyunt+tala=go?r 
male.student+female.student=PL=ERG BURM:just+BURM:law=PP 


má-z3?»?ér-lá—hà? 
PL=learn=MOD:DEO=REAL 
‘Students have to learn justice.’ 


7 This type of tone alternation is not always applied, and both tones can appear with some verbs, as below. The 
complex tone alternation remains a matter for further investigation. 
cf. cer pio=ha ká={ hi / hi }=ha? 
1SG 3SG=OBJ 1SG=ask=REAL 
‘I asked him.’ 


There are no enclitic pronouns or pronominal suffixes as found in Tiddim Chin (Otsuka 
2009: 199) or Hakha Lai (Peterson 2003: 415). In most Chin languages, the clitic pronouns 
also express both the person and number of the possessors, and are identical in form to the 
verbal subject agreement marker, as illustrated in (22) through (24). 


(22) kà- pa: ‘my father’ [Northern Chin: Tiddim Chin] 


(23) a. ka-paa ‘my father’ [Central Chin: Hakha Lai] (Peterson 2003: 414) 
b. ká-pàà ‘my father’ [Central Chin: Mizo] (Chhangte 1993: 171) 


(24)  kah- pa: ‘my father’ [Southern Chin: Daai Chin] (So-Hartmann 2009: 84) 


This generalization holds only partly for the clitic pronouns in Asho Chin, as seen in (25) and 
(26). The underlying low tone in the only or last syllable of the possessee noun is changed to 
high tone if it is preceded by an independent pronoun, as shown in (26). The clitic pronouns, 
however, do not function as possessive markers for some inalienable nouns, as shown in (27). 
Note that there is no formal indication of possession other than the clitic pronouns or the 
juxtaposition of two nominals, where the possessor NP precedes the head noun. 


(25) a. ka= ló 
1SG= head 
‘my head’ 


b. cél ló 
1SG head 
‘my head’ 


(26) a. ka= po 
1SG= father 
‘my father’ 


b. cél po 
1SG POSS:father 
‘my father’ 


(27) a. ka= nari 
1SG= watch 
‘my watch’ 


b. cél nárí 
1SG watch 
‘my watch’ 


3. Inverse marking in Asho Chin 


The inverse marker mă- is often attached to a transitive verb if the undergoer, such as a 
patient, a recipient, a causee, a beneficiary, or a concomitant with the comitative applicative 
-pwi / -bwi,⁸ is a speech-act participant, i.e., a speaker and/or a listener. The voiceless 
unaspirated initial consonant /p, t, k, c, s/ of the following element alternates with its voiced 
counterpart /b, d, g, j, z/. The underlying low tone also alternates with the high tone. 


8 In terms of semantic macro-roles (Foley and Van Valin 1984: 29-30), the ‘undergoer’ can be characterized as 
the argument which expresses a participant who does not perform, initiate, or control any situation but rather is 
affected in some way. 


(28)  p'yó?-n3? má-zó=fiá? 
BURM:gnat-ERG INV-INV:bite=REAL 
‘A gnat bit us.’ (cf. só ‘to bite’) [PATIENT: 1PL] 


(29)  pás'év=na? cér=fiá SOPUP má-bár?=ká? 
PN=ERG 1SG=OBJ BURM:book INV-INV:give=REAL 
‘Pasen gave me a book.’ (cf. pai? ‘to give’) [RECIPIENT: 1SG] 


(30) Pattwer mă-hlér=dá?=kő? 
chicken+egg INV-buy=CAUS=REAL 
‘I was / You were made to buy an egg.’ [CAUSEE: 1SG/2SG] 


(31)  2ér-wár-ná tud má-jáo*pái?-kár-lé 
eat=desire=if now INV-INV:fry-give-IRR-SFP 
‘If you want to eat it, I will fry it for you.’ (cf. cáo ‘to fry’) [BENEFICIARY: 2SG] 


(32)  s'álàr ja=naP naun=ya Do má-Pér-bwi=fió?=mà 
Mr. Ja=ERG 2SG=OBJ meal INV-eat-COM=REAL=Q 
‘Did Mr. Ja have a meal with you?’ [CONCOMITANT: 2SG] 


The marker mă- also indicates that the undergoer is a speech-act participant in 
sentences where the undergoer NP is not overtly expressed, as shown below (Otsuka 2014). 


(33)  yó=nà? má-P?ó-nào?=k3á? 
rain=ERG INV-INV:fall-TRNS=REAL 
‘It rained on me.’ (cf. do ‘to fall’) 


The inverse marker mă- is identical in form with the plural clitic pronoun ma= (cf. Table 2); 
however, it differs from the clitic pronoun in that the inverse marker requires consonant 
voicing (e.g., (34) b.) and tone alternation (e.g., (35) b.) on the following element.⁹ 


(34) a. má=póhár=hiá? 
PL=surprise=REAL 
‘They surprised (someone).’ 


b. ma-bohar=haP 
INV-INV:surprise=REAL 
‘{I was / You were} surprised.’ (cf. pohár ‘to be surprised’) 


9 Such voicing and tone alternation are also found in negative clauses, as seen in examples (a) and (b) below. 
The negative is expressed by attaching the negative marker =lá? (or =hà in some subordinate and prohibitive 
clauses) to a verb. Negative clauses are unmarked for agreement. 


(a) cer Pücà-hlü báo-t'ér-hów-lá? 
1SG Asho-ESS NEG:speak-MOD:DYN-ASP-NEG 
‘I cannot speak Asho yet.’ (cf. páo ‘to speak’) 

(b) cel Payobi=ha g2Nn=la? 
1SG noon=LOC NEG:free=NEG 
‘I am not free at noon.’ (cf. kòn ‘to be free’) 


(35) a. má=wi=há? 
PL=call=REAL 
‘{We / You (pl.) / They} called (someone).’ 


b. má-wí-há? 
INV-INV:call-REAL 
‘{I was / You were} called.’ (cf. wi ‘to call’) 
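
The two alternations that accompany the inverse marker, voicing of the initial consonant and 
the change from low to high tone, can be sketched in code. The following Python snippet is 
only an illustration under stated assumptions: the function name `inverse_stem` is 
hypothetical, aspiration is assumed to be written with ', ʰ or h after the initial consonant, low 
tone is assumed to be written with a grave accent and high tone with an acute accent, and the 
tone change is assumed to apply to the first low-toned vowel. None of these conventions is 
confirmed by the data above. 

```python
import unicodedata

# Voiceless unaspirated initials and their voiced counterparts, as listed in the text.
VOICED = {"p": "b", "t": "d", "k": "g", "c": "j", "s": "z"}
# Assumption: these symbols mark aspiration when they follow the initial consonant.
ASPIRATION = ("'", "\u02b0", "h")
GRAVE, ACUTE = "\u0300", "\u0301"  # combining low-tone and high-tone marks (assumed)


def inverse_stem(stem: str) -> str:
    """Return the stem shape assumed to follow the inverse marker (hypothetical helper)."""
    s = unicodedata.normalize("NFD", stem)  # split vowels from their tone marks
    # Voice the initial consonant only if it is voiceless and unaspirated.
    if s and s[0] in VOICED and s[1:2] not in ASPIRATION:
        s = VOICED[s[0]] + s[1:]
    # Assumption: only the first low tone is raised to a high tone.
    s = s.replace(GRAVE, ACUTE, 1)
    return unicodedata.normalize("NFC", s)
```

Under these assumptions, `inverse_stem("só")` gives `zó`, which matches the voiced stem 
glossed INV:bite in (28), while an aspirated initial such as the one in `p'o` is left unchanged. 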


In addition, note that no clitic pronouns, except for the first person singular clitic ka=, 
occur with the inverse marker mă-: e.g., kă=mă- <1SG=INV-> / *na=mă- <2SG=INV-> / 
*?a=mă- <3SG=INV-> / *ma=mă- <PL=INV->. An example with kă=mă- <1SG=INV-> is 
shown below. 


(36)  cér nadun=ya paonhmon kă=mă-bár?=kő? 
1SG 2SG=OBJ BURM:bread 1SG=INV-INV:give=REAL 
‘I gave you some bread.’ (cf. pàr? ‘to give’) 
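
The conditions on the inverse marker described in this section can be summarized as a small 
decision procedure. The Python sketch below is illustrative only: the function 
`inverse_prefix` and its `use_inverse` flag are hypothetical names, and the flag merely stands 
in for the optionality of mă-, whose conditioning factors are not yet known. 

```python
# Inverse marker and 1SG clitic, written with explicit escapes for the breve.
INV = "m\u0103-"      # inverse marker
CL_1SG = "k\u0103="   # first person singular clitic


def inverse_prefix(agent, undergoer, use_inverse=True):
    """Return the prefix string before a transitive verb, or "" if the marker is absent.

    agent and undergoer are (person, number) pairs such as (1, "SG").
    """
    undergoer_person = undergoer[0]
    # The inverse marker is available only when the undergoer is a speech-act participant.
    if not use_inverse or undergoer_person not in (1, 2):
        return ""
    # Only the first person singular clitic may co-occur with the inverse marker.
    if agent == (1, "SG"):
        return CL_1SG + INV
    return INV
```

For instance, `inverse_prefix((1, "SG"), (2, "SG"))` returns the sequence kă=mă- seen in (36), 
whereas a third-person undergoer yields an empty string. 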


The prefix ?a- is attached to a verb without any voicing or tone alternation if the 
undergoer is the speaker in an imperative clause, as seen in (37); alternatively, tone 
alternation occurs, with the underlying tone changed to the high tone, as shown in (38). 


(37)  cér-há-k'5 ?Pa-parP=mo=s"In=yeI 
1SG=OBJ=also 2>1-give=still=MOD:DEO=IMP 
‘Give me more, too.’ (cf. par? ‘to give’) 


(38) mwér-mó-ná pái?—kéi 
exist=still=if give=IMP 
‘If you still have more, give them to him.’ (cf. pàr? ‘to give’) 


4. Conclusion 


This paper first gave a brief outline of the modern Asho Chin language and its 
sociolinguistic background, and then described its person marking system. Asho 
Chin has two types of personal pronouns, i.e., independent personal pronouns and clitic 
pronouns, similar forms of which can also be found in the other Chin languages. 

Some of the Central and Southern Chin languages, such as Mizo (Chhangte 1993), Hakha 
Lai (Peterson 1998, 2003), and Daai Chin (So-Hartmann 2009), have clitic pronouns that 
agree with the grammatical object, as illustrated in (39). However, Asho Chin does not have 
such clitic pronouns. 


(39) Pan-kan-tho?y (Peterson 1998: 90) 
3PL:S-1PL:O-hit"! 
‘They hit me.’ 


Instead, we found that Asho Chin has an inverse marking system, as seen in Figure 2. The 
arrow in the figure indicates the direction of the event: A → O. 

- Speech-Act Participants - 


Figure 2 - Inverse marking in Asho Chin 


As DeLancey (1981: 644) suggested, both 2nd>1st ranking and 1st>2nd ranking are 
possible variations on the universal speech-act participant>3rd theme, so the 1st and 
2nd person marking systems are rather complicated. 

A similar inverse marking system can also be found in Tiddim Chin, a 
Northern Chin language, where the inverse marker (or the cislocative marker) óy- is 
obligatorily attached to a transitive verb if the undergoer is a speech-act participant (Otsuka 
2009). 


(40) a. kén lian SULT -in (Otsuka 2009: 205) 
1SG.ERG PN kick! =1SG.REAL 
‘I kicked Lian.’ [Tiddim Chin] 


b. aman ` [kéi/nágjóg- sui (Otsuka 2009: 205) 
3sG.ERG 1SG/2SG INV- kick! 
‘He kicked {me/you}.’ [Tiddim Chin] 


Although the use of the inverse or cislocative marker óy- in Tiddim Chin is obligatory, the 
inverse marker mă- is not obligatorily attached to a verb in Asho Chin. The use of the inverse 
marker mă- may vary according to age and dialect in the modern Asho Chin language. Further 
investigation is needed to understand the ongoing change in the inverse marking 
system as well as the person marking system in Asho Chin. 


Abbreviations 


1       first person 
2       second person 
3       third person 
X>Y     direction from X toward Y 
*       ungrammatical 
-       affix boundary 
=       clitic boundary 
+       compound boundary 
S falling tone 
Form I verb stem 
Form II verb stem 

A       the agent-like argument associated with prototypical transitive verbs 
ABL     ablative 
ALL     allative 
ASP     aspectual 
BURM    loan word from Burmese 
CAUS    causative 
CL      classifier 
COM     comitative 
CONJN   conjunction 
DEM     demonstrative 
DEO     deontic 
DU      dual 
DYN     dynamic 
ERG     ergative 
ESS     essive 
IMP     imperative 
INC     inclusive 
INV     inverse marker 
IRR     irrealis 
LOC     locative 
MOD     modal 
NEG     negative 
NMLZ    nominalizer 
O       the patient-like argument associated with prototypical transitive verbs 
OBJ     objective 
PAST    past 
PL      plural 
PN      proper noun 
POL     polite 
POSS    possessee 
PP      pragmatic particle 
Q       question marker 
REAL    realis 
S       the single argument associated with canonical intransitive verbs 
SFP     sentence final particle 
SG      singular 
TRNS    transitivizer 


1. Introduction 


Passive constructions are quite common in the world's languages. The prototypical 
passive voice is used primarily for agent suppression or de-topicalization. The fact that a 
non-agent argument, most commonly the patient, is then topicalized is but the default 
consequence of agent suppression (Givón 2001: 125). In the same way, Boro uses agent 
de-topicalized constructions to express passive meaning. We call these the “passive-like 
constructions in Boro”. They resemble the passive in European languages like English but 
have some unique characteristics, which are the subject of this paper. We provide a 
morphosyntactic and semantic/pragmatic description of the passive-like constructions in Boro 
and, in particular, discuss the factors which characterize them, viz., ‘animacy’, ‘control’ and 
‘volition’. 

Boro expresses passive-like meanings by using za ‘happen, to take place’ as an auxiliary 
verb. 


(1) miusum-a miusa-zwum or-thar-za-bai 
cow-SUBJ tiger-INST bite-RES:death-happen-PRF 
‘The cow was killed by the tiger.’ 


The constructions which involve za are called ‘za constructions’ (Boro in press). These za 
constructions in Boro have a wide range of functions besides that of expressing a passive 
meaning. According to Boro, there are five major za constructions, viz. “spontaneous event 
constructions”, “reciprocal/collective constructions”, “facilitative constructions”, 
“reflexive-causative” and “adversative constructions”. In this paper, however, we discuss 
only those constructions which are specifically associated with the function of expressing a 
passive meaning. 


1 The order of authorship is alphabetical and is not based on the contribution that the authors have made to 
this publication. We would like to thank Prof. Scott DeLancey and Krishna Boro for their valuable suggestions 
on this work. 

This paper is the result of research undertaken at the same time as research by 
Krishna Boro, which resulted in Boro (in press), a publication on the same topic with the title 
“The Za constructions: Middle-like constructions in Boro”. While the research and write-up 
were carried out independently, some references to K. Boro's analysis were added later. 
K. Boro's paper has a larger scope than the present paper and discusses other Za 
constructions in addition to the ones discussed here. The goal of our paper is to focus 
on the similarities and dissimilarities between a subset of the Za constructions and the English 
passive construction. 

The paper is structured as follows: §2 provides information about the language 
and the source of the data. §3 describes the basic clause structure of Boro. In §4, we deal 
with the different types of passive-like constructions and their basic similarities and 
dissimilarities with the English passive. This section is further divided into two subsections: 
§4.1 discusses the similarities, and §4.2 the dissimilarities. §5 concludes the paper. 


2. Language background 


Boro is a member of the Bodo-Konyak-Jinghpaw sub-branch of the Tibeto-Burman language 
family (Bradley 1997; Burling 2003). It is mainly spoken in the Brahmaputra valley of Assam 
in India. It is also spoken in some parts of Nepal, West Bengal and Nagaland. There are 
approximately 1,330,000 speakers in India (Lewis et al. 2013). 

Our study is based on the standard variety of Boro spoken in Kokrajhar. The data mostly 
come from conversations, narrations of personal experiences and some constructed sentences. 


3. Basic clause structure 


Like many other South-east Asian languages, Boro has an overt case marking system. 
Generally, the S and A arguments of basic constructions are marked with the nominative 
suffix -a~-w~-ju. The P argument in a basic active clause is marked with the accusative suffix 
-khu. However, the case marking on the subject and object is quite complex. The markings 
occur under pragmatic conditions which are difficult to specify. Although the grammatical 
relations subject and object do play a role in Boro syntax, the “case” forms have a much more 
complex function than the grammatical cases of Indo-European languages, in that their 
occurrence is determined not simply by the syntactic grammatical relation of a noun phrase 
argument, but also by its pragmatic status as definite, topical, etc. (DeLancey and Boro, in 
preparation). 


(2)  zodu-a modu-k^u nu-duymun 
Jodu-SUBJ Modu-OBJ see-REALIS.PST 
‘Jodu saw Modu.’ 


(3) bi-jui aphel-k^u za-bai 
3SG-SUBJ apple-OBJ eat-PRF 
‘He has eaten the apple.’ 


(4) burai burwui got'o-k'u abra nu-na 
old.man old.woman boy-OBJ stupid see-NF 
t'aka hwu-ak^ui-swi 
money give-NEG.PRF-INT 
‘The old man and the old woman, seeing the boy as a fool, did not give the money.’ 


(5) ay wunk^am za-dwy 
1SG rice eat-REALIS 
‘I am eating rice.’ 


As the above examples show, the arguments are optionally marked. The A and P 
arguments in (2) and (3) are marked with the subject and the object marker, respectively. 
In (4), however, the P argument is marked with -khu while the A argument is not marked. 
Similarly, in (5), neither argument takes a case marker. The optional marking on 
the A and P arguments depends on certain pragmatic conditions. The existing literature offers 
no clear description of the factors underlying the use of overt case markers. 


4. Passive-like constructions 


The passive-like constructions in Boro represent a subset of za constructions. The patients of 
these constructions take an active role in the initiation of the event. However, the participation 
of the patient can be either willful or against the will of the patient. This factor therefore 
requires the patient to be animate. The effect of volition and animacy will be 
discussed in §4.2. Syntactically, the patient usually occurs in subject position and takes the 
subject marker -a~-ja~-w~-ju. The agent, which is optional, is marked with the instrumental 
marker -zwng. The marking of the agent with -zwng is itself optional in some contexts, as 
discussed further below. The lexical verb together with -za ‘happen/to take place’ forms the 
verb complex. 


(6) ram-a zodu-zwng rai-za-dwng 
Ram-SUBJ Zodu-INST scold-happen-REALIS 
'Ram was scolded by Zodu.' 


The event types in this construction are typically considered to indicate an adverse effect 
on the patient participant. These types of constructions are commonly known as the 
"adversative passive". However, it is important to note that not all passive constructions 
indicate an inherent negative effect on the patient. 

In this section we will talk about the passive-like constructions that are similar to the ones 
in English and also those which are different from English. We will provide morphosyntactic 
and semantic/pragmatic descriptions of those constructions. 


4.1. Basic similarities with European Passive constructions 


There is a subtype of the passive-like construction in Boro that bears a close resemblance to 
European passive constructions. The patient argument is topicalized, in the sense that it 
syntactically becomes the subject and is marked with the suffix -a~-ja~-w~-ju. The agent, on 
the other hand, is suppressed and hence becomes optional; when present, it takes the 
instrumental marker -zwng. In the verb complex, the lexical verb occurs with -za. Consider the 
following examples. 


(7) anu bu-za-duy (bi-zu) 
1SG-SUBJ beat-happen-REALIS 3SG-INST 
‘I was beaten.’ 


(8) muswmu-a musa-zwun or-thar-za-bai. 
cow-SUBJ tiger-INST bite-RES:death-happen-PRF 
‘The cow was killed by the tiger.’ 


(9) anu no-ao gar-lan-za-duymun 
1SG-SUBJ house-LOC leave-DISTAL-happen-REALIS.PST 
‘I was left alone in the house.’ 


In examples (7), (8), and (9), the subject marker -a~-w on the patients is optional and 
depends on the pragmatic situation. The agent NPs are usually optional and take the 
instrumental marker -zwng. In addition, the verbs take the auxiliary suffix -za. As mentioned 
earlier, all the patients in (7), (8) and (9) are animate and are affected in one way or another, 
a property that is not shared by European passive constructions. In (7), ay ‘I’ is beaten by 
someone. Example (8), a repetition of example (1), is a typical passive construction: muswu 
‘cow’ is killed by musa ‘tiger’. In (9), ay ‘I’ is unhappy about being left alone. 


4.2. Dissimilarities 


Contrary to what we have seen in the above section, there are some passive-like constructions 
which are structurally different from the European passives. They are mainly the passive-like 
constructions with intransitive verbs and imperative passive-like constructions. 

The intransitive passive-like constructions defy the general property of basic passives 
which says that the main verb in its non-passive form is transitive, and the main verb 
expresses an action, taking agent subjects and patient objects in its non-passive form (Dryer 
and Keenan 2007). The interesting characteristic of these constructions is that there are no 
active counterparts to them. The intransitive passives have an inherent adverse or negative 
effect which can be interpreted as the consequence of someone else's action and the 
maleficiary's inaction. The most important factor that characterizes these constructions is 
volition. Consider the following examples. 


(10)  mug-wu thay-za-guin 
2SG-SUBJ go-happen-FUT 
‘(S/he) goes, you will be affected.’ 


(11) mug-u k'ar-lag-za-gwun 
2SG-SUBJ run-take.away-happen-FUT 
‘(S/he) runs away, you will be affected.’ 


Examples (10) and (11) express the passive sense. Here, t'ay ‘go’ and k'arlay ‘run away’ 
behave as the main verbs, and -za is an auxiliary which provides the passive meaning of the 
construction. The second-person participants will be adversely affected if the third-person S 
argument of t'ay ‘go’ and k'arlay ‘run away’, which is not overtly expressed, goes or runs 
away, respectively. 

The control of the subject of the passive-like construction on the action has an important 
role in passive-like constructions. The P argument either has full control or very low control 
over the action taking place. Let us look into some more examples. 


(12)  buraj-a gao-nwu an-zwum nu-za-duymun 
old.man-SUBJ oneself-EMPH 1SG-INST see-happen-REALIS.PST 
‘The old man made himself be seen by me.’ 


(13)  sikhao-a suk'idar-zwum nu-za-duy 
thief-SUBJ watchman-INST see-happen-PRES 
‘The thief was seen by the watchman.’ 


Examples (12) and (13) are structurally the same: the predicate includes za, and the 
arguments are marked by the subject marker and the instrumental marker, respectively. 
Both therefore look like the passive-like constructions discussed so far. What differs is the 
volition on the part of the patient, and this difference is additionally shown by the use of the 
pronoun gao ‘oneself’ in (12). The use of gao ‘oneself’ makes clear that the patient burai 
‘old man’ in (12) willingly lets himself be seen by, or appears in front of, ay ‘I’. In contrast, 
the patient sikhao ‘thief’ in (13) is seen inadvertently, perhaps because his carelessness or 
over-confidence led him to come across the watchman. In that sense, the patient sikhao 
‘thief’ still has control, in that his stupidity or carelessness caused him to be seen, and he can 
be considered responsible for being seen, even if inadvertently. The different degrees of 
control in (12) and (13) thus depend on context, because the two constructions are otherwise 
structurally identical. 

Another interesting construction in Boro is the imperative passive-like construction. 
Whereas imperative passive constructions are quite unnatural and dramatic in English, 
Boro uses the imperative passive-like construction like any other active imperative. This is 
probably because English passives carry no meaning of control, whereas 
this type of za construction in Boro involves volition. The following are some 
examples. 


(14)  ap'a-zug bam-za-du nuy-u 
father-INST carry-happen-IMP 2SG-SUBJ 
‘Be carried by your father!’ 


(15) mug-wu phurwug-za-du de abo-zwy 
2SG-SUBJ teach-happen-IMP REQ sister-INST 
‘Be taught by your sister.’ 

(16)  mug-wu t'ukwi-za-du abo-zwy 
2SG-SUBJ bathe-happen-IMP sister-INST 
‘Be bathed by your sister.’ 


Examples (14), (15) and (16) illustrate an interesting characteristic of the Boro 
imperative passive-like constructions: even as the undergoer of the event, the patient 
argument has some control over the action. Following Boro's (in press) analysis, the 
patient participants are more central to the events in these za constructions, in that they enable 
the event. This leads us to another inherent factor of the Boro passive-like constructions, 
namely animacy. 

The idea of patient control is reasonable only when the patient 
argument is animate. Hence, animacy is an important factor in characterizing the Boro 
passive-like constructions. As mentioned above, the patient must be able to perceive 
the effect of the action. It is therefore compulsory to have an animate patient in a passive-like 
construction. The following are some unacceptable examples with inanimate patients. 


(17) *bizab-a an-zuy lir-za-dum 
book-SUBJ 1SG-INST write-happen-REALIS 
Intended meaning: ‘The book was written by me.’ 


(18)  *sinema-ja nai-za-duymun 
movie-SUBJ look-happen-REALIS.PST 
Intended meaning: ‘The movie was seen by me.’ 


In (17) and (18), both bizab ‘book’ and sinema ‘movie’ are inanimate objects which 
cannot be affected in any way by the agent. Therefore, these constructions are ungrammatical 
in Boro. 
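
The morphosyntactic template and the animacy restriction discussed so far can be sketched as 
a small gloss builder. This Python snippet is illustrative only: the function `za_gloss` and 
its parameters are hypothetical, it produces gloss lines rather than Boro forms, and it treats 
subject marking as always present although the text notes that it is pragmatically conditioned. 

```python
def za_gloss(patient, verb, tam, agent=None, patient_animate=True, mark_agent=True):
    """Build the gloss line of a passive-like za construction (hypothetical helper).

    patient, verb and agent are gloss labels, and tam is a label such as "REALIS".
    """
    # The construction is rejected when the patient is inanimate, as in (17) and (18).
    if not patient_animate:
        raise ValueError("a passive-like za construction requires an animate patient")
    parts = [f"{patient}-SUBJ"]  # the patient occupies subject position
    if agent is not None:        # the agent is optional
        # Instrumental marking on the agent is itself optional in some contexts.
        parts.append(f"{agent}-INST" if mark_agent else agent)
    parts.append(f"{verb}-happen-{tam}")  # lexical verb plus za 'happen'
    return " ".join(parts)
```

Called as `za_gloss("Ram", "scold", "REALIS", agent="Zodu")`, the sketch reproduces the gloss 
line of (6), `Ram-SUBJ Zodu-INST scold-happen-REALIS`; with `patient_animate=False` it raises 
an error instead of returning a gloss. 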

In the following example (19), hadwr ‘country’ is an inanimate object in many languages. 
However, in this example 'country' metonymically stands for the people of the country. (19) is 
a grammatically correct passive-like construction. 


(19) be hadur-a nepolian-zwy gaglub-za-duymun 
this country-SUBJ Napoleon-INST conquer-happen-REALIS.PST 
‘This country was conquered by Napoleon.’ 


We can also find some instances which express a passive sense without the agent being 
marked with the instrumental marker -zwng. Consider the following example. 


(20) bi-ju hinzao k'arson-p'ui-za-duy 
3SG-SUBJ woman marry.forcefully-come-happen-REALIS 
‘A woman has come to his house to marry him against his will.’ 


Example (20) is similar in structure to (19) except that the agent lacks the instrumental 
marker -zwng. This construction is used when the listener does not have any 
prior knowledge about the agent. Here, the agent hinzao ‘woman’ is unknown to the listener. 
Let us look at the next example. 


(21) bi-ju hinzao-zwy k'arson-p'ui-za-duy 
3SG-SUBJ woman-INST marry.forcefully-come-happen-REALIS 
‘The woman has come to his house to marry him against his will.’ 


Here, the speaker as well as the listener knows the woman, and knows that she is the one who 
wanted to marry the patient of the construction. Therefore, the agent hinzao ‘woman’ is 
marked with the instrumental -zwng. In both sentences, the agent has come to the patient's 
house to marry him against his will. The only difference between the two constructions is that 
the agent is familiar in (21) and unfamiliar in (20). Examples (20) and (21), together with 
examples (22) and (23) below, show that, unlike in the English passive, marking of the agent 
with the instrumental marker -zwng is not required under some pragmatic and semantic 
conditions. The following are clearer examples of constructions without agent marking. 


(22)  hinzao-a saront'ai gao-za-duymun 
woman-SUBJ lightning shoot-happen-REALIS.PST 
‘The woman was struck by lightning.’ 


(23)  bisur-w dui lum-za-zwub-bai 
3PL-SUBJ water  flood-happen-finish-PRF 
‘Their property is completely submerged by flood.’ 


In example (22), the agent saront'ai ‘lightning’ is not marked with the instrumental marker 
-zwng. Pragmatically, this construction carries a passive meaning. One possible hypothesis 
is that in Boro, saront'ai is not semantically separate from the verb gao ‘shoot’. The 
meaning of the construction would then be “the striking of lightning happened to the 
woman”. 

Similarly, in (23), dui ‘water’ and lum ‘flood (verb)’ are not semantically separate. The 
two words form a conjunct verb dui lum, which means ‘flooding’. Boro does have a 
noun for ‘flood’, bana, but one does not say bana lum-za-zwb-bai. Hence, the construction 
means something like “the flood happened and the property was completely 
submerged”. 


5. Conclusion 


The Boro za constructions have some of the general properties that linguists use to 
define passive-like constructions cross-linguistically. In this paper, we have compared the 
Boro passive-like constructions with the English passive. Having discussed the basic 
similarities and dissimilarities between them on the basis of morphosyntactic and 
pragmatic/semantic description, we have found four ways in which Boro passive-like 
constructions differ from passives in European languages. First, although Boro passive-like 
constructions are not always intransitive, there are constructions which express a passive-like 
meaning even though the verb is intransitive. Second, Boro has imperative passive-like 
constructions. Third, the patient in a Boro passive-like construction has some kind of 
control over the event, either willingly or through inaction. Finally, a Boro passive-like 
construction must have an animate patient. 


Abbreviations 

EMPH Emphatic marker 
FUT Future tense marker 
INST Instrument case marker 
IMP Imperative marker 
LOC Locative 
NF Non final marker 
OBJ Object 
PL Plural 
PRES Present tense 
PRF Perfect 
PST Past tense 
REFL Reflexive 
REQ Request 
RES Result 
SG Singular 
SUBJ Subject 
Abstract 

In order to refer to persons to whom an individual is related through kinship, every ethnic or linguistic 
group employs a set of kinship terms, which forms an important feature of that group. The system 
of kinship is more resistant to change than other linguistic features and thus provides 
invaluable information about the ethnic ties of the language group. The present paper discusses 
the kinship terminology of the Hruso language in an attempt to examine its linguistic classification as a 
Tibeto-Burman language. Hruso (also known as Aka) is spoken by the Hruso tribe inhabiting West 
Kameng district of Arunachal Pradesh, India. The language, with a total population of 6000 speakers, has been 
listed as endangered by UNESCO’s report (2009). It belongs to the Hrusish group of the Tibeto-Burman 
languages, comprising Hruso/Aka, Dhammai/Miji and Bangru/Levai (Burling 2003: 180). 
However, opinions differ regarding its classification as a Tibeto-Burman language. 
Some linguists believe it to be a language isolate, or to have more Sino-Tibetan features 
than Tibeto-Burman ones. The study of kinship terms tries to shed light on its genetic affiliation. In 
Hruso, despite some differences, the fundamental nature of the kinship system bears affinity with the 
Proto-Tibeto-Burman (PTB) system of kinship terminology. In Hruso, most of the kinship reference/address 
terms have the prefix a-, as in a-u ‘father’, a-ig ‘mother’, a-p'e ‘aunt’ and so on. The kinship terms 
are also based on gender distinction, where the feminine gender carries the suffix -m and the 
masculine gender remains unmarked. The Proto-Tibetan kinship terms for neutral gender mostly denote 
relatives younger than the speaker. As in many Tibeto-Burman languages, the Hruso term for younger 
siblings is used irrespective of gender. The study reveals that, apart from the existence of 
cross-cousin and matrilineal parallel-cousin marriages and the resultant pattern in kinship terminology, 
which resembles that of other Tibeto-Burman languages, many of the terms can be seen as extensions of 
the PTB forms. 
1. Introduction 


The Hrusos are a small tribal group inhabiting the southern area of West Kameng district of 
Arunachal Pradesh in India. They are also known as Akas. The name “Aka”, which means 
‘painted’, was given to them by the people of Assam because of their custom of 
face-painting (Pandey 2009). Hruso is the autonym. The natives also 
pronounce it as yusso. The Hruso speakers are mainly located at Jamiri, Thrizino and 
Bhalukpong of West Kameng district. The Hruso language belongs to the Hrusish 
sub-group of the Tibeto-Burman languages, comprising Hruso/Aka, Dhammai/Miji and 
Bangru/Levai (Burling 2003: 180). These groups are ethnically related and are expected to 
show linguistic similarities. Abraham et al. (2005) mention that the Mijis and the Akas 
are closely related: they share many beliefs and customs and have a long history of 
inter-marriage. Even the name “Miji” was given by the Akas, where mi means ‘fire’ and ji means 
‘living’. Shafer (1947) grouped Hruso linguistically with Miji of West 
Kameng and, based on the linguistic similarities between the two languages, even called 
Hruso (Aka) “Hruso A” and Dhammai (Miji) “Hruso B”. Hruso is also ethnically 
grouped with Koro, also known as Koro Aka or Miri-Aka in the existing literature. Until the recent 
past, the Koro language, spoken in East Kameng district, was clubbed together with Hruso. 
However, Anderson and Murmu (2010) and Blench and Post (2014), in a brief comparison 
of Koro and Hruso, show that the two have virtually nothing in common and that they are 
not two varieties of the same language. 

The Hruso language has no script of its own and has survived through oral 
tradition. It had 3,531 speakers in the 1991 census, a figure which increased to 5,027 
in 2006 (Nimachow et al. 2011). It has been listed as endangered by UNESCO’s report 
(Moseley 2009). 

The history of the Hrusos gives us a glimpse of their long and close association with 
Assam. Historically, the Hrusos believed themselves to be the descendants of Bhaluka, 
grandson of Ban Raja. In Hindu mythology, Ban Raja, also known as Banasura, once ruled 
central Assam with his kingdom in present-day Tezpur (Gait 1906). Geographically, the 
Hruso area is bounded on the south by the Darrang district of Assam. According to some 
accounts, they also inhabited some parts of Assam like Missamari and Balipara in Sonitpur 
district. However, at the time of a serious epidemic, they fled to the hilly 
areas. The boundary line of the Hruso territory was demarcated in 1874-75 (Das 1989: 35). 
Bhalukpong, which marks the boundary between the two states of Arunachal Pradesh 
and Assam, is claimed by the Hrusos to be the kingdom of their ancestor Bhaluka, after whom 
the place is named. 

Even though Hruso is grouped as one of the Tibeto-Burman languages, there are 
differences of opinion regarding its phylogenetic position. In many existing classifications of 
this language family, Hruso has been grouped under the ‘Kamarupan’ or ‘North 
Assam’ sub-group of Tibeto-Burman languages, based on geographical location (Burling 
1999). Some linguists believe it to be a language isolate, or to have more Sinitic features 
than Tibeto-Burman ones (Blench and Post 2014). In his seminal work Linguistic Survey of 
India, Grierson (1909) also mentioned that “Aka has a peculiar appearance and it is often 
difficult to compare its vocabulary with that of other Tibeto-Burman forms of speech”. 


1.1. Objectives 


In the light of the above controversy regarding the genetic affiliation of Hruso with 
Proto-Tibeto-Burman (PTB), the present paper examines the kinship terms 
of the language in anticipation of finding some clue. 


2. Hruso Kinship terminology: a descriptive analysis 


Kinship relations are a set of interacting roles attributed to the various statuses of kinsmen, and 
every ethnic or linguistic group has a set of words that symbolize each status. In other words, 
kinship terminology refers to the various systems used in languages to refer to the persons to 
whom an individual is related through kinship. Kinship terms form an important feature of 
any ethnic or linguistic group, as they give us insight into the origin, culture and genealogy of 
a particular linguistic community. The reason is probably that the kinship 
system is as old as the family system in a community and is mostly preserved by the 
community through the ages. The idea that kinship systems are more resistant to change was 
put forward by Morgan (1859) in the following lines: 


"Language changes its vocabulary, not only, but also modifies its grammatical structure in the progress of 
ages; thus eluding the inquiries which philologists have pressed it to answer; but a system of relationships 
once matured, and brought into operation, is, in the nature of things, more unchangeable than language — not 
in the names employed as a vocabulary of relationships, for these are mutable, but in the ideas which 
underlie the system itself." 


Going by the above, the structure of the Hruso kinship system, in terms of which 
relations are coded by which terms and the system of marriage (who can marry whom), is 
expected to be similar to the structure of the reconstructed Tibeto-Burman system. On the 
other hand, the terms used to refer to these relations are, as Morgan puts it, 
mutable, and thus the Hruso kin terms might have changed 
considerably over time. In this paper an attempt has been made to refer to the PTB 
roots of the kin terms in order to assess Hruso's affinity with PTB. The system of 
kinship terminology in Hruso has its own ethnic value. If we go by the existing classification 
that Hruso belongs to a sub-group of the greater Tibeto-Burman family, the kinship 
nomenclature is expected to bear similarities with that of other Tibeto-Burman languages. 
The descriptive analysis of the Hruso kinship system in the following sub-sections brings out its 
crucial features. 

Hruso kinship nomenclature covers both consanguineal and affinal relations. The 
parent terms are a-iy ‘mother’ and a-u ‘father’. For relations elder to ego, the address 
and reference terms are normally the same. Those younger than ego are addressed by 
their names. Hruso has terms only for the immediate grandparents, namely 
ain-mohrom ‘grandmother’ and ao-mohro ‘grandfather’, literally meaning ‘mother-old woman’ 
and ‘father-old man’, respectively. Terms for the 3rd and the 4th ascending generations, i.e., 
for great-grandfather or great-grandfather’s father and their feminine 
counterparts, do not exist in the language. Hruso also does not have terms for grandchildren. 
Speakers normally refer to them as no so i so/sam ‘1SG.POSS son POSS son/daughter’, i.e. ‘my son's 
son/daughter’. 


2.1. Prefix a- 


In Hruso, most of the kinship reference/address terms have the prefix a-, as in a-u ‘father’, 
a-iy ‘mother’, a-p'e-ho ‘uncle’, a-p'e ‘aunt’ and so on. The parent terms a-iy ‘mother’² and 
a-u ‘father’ share the TB feature of a prefixed a- but differ in the 
roots -iy and -u, which act as the sex modifiers. The kinship terms with prefix a- in the 
language are listed in Table 1 below. 

In Hruso the reference and address terms (with prefix a-) for kin are in most cases 
the same, while younger brother, younger sister, nephew and niece are generally addressed by 
their names. For instance, the term aig ‘mother’ appears as i aig ‘his 
mother’, no aig ‘my mother’ or ba aig ‘your mother’. In other words, the prefix a- is retained 
even in possessive constructions, and no morphophonemic change takes place. However, 
in the case of affinal kin terms, for instance lofi ‘husband’, the central 
vowel /o/ is lost, giving ilfi ‘her husband’. 

The a- prefix is also found in Hruso with some adverbs like a-ge ‘here’ (the-ge 
‘there’), a-jo ‘hence’ and a-pa ‘in this way’ (siia ‘in that way’), but it is not used 
with other nominals. 


2 Similar to Assamese ai ‘mother’. 
Table 1: Kinship terms in Hruso with prefix a- 


Words                         Gloss 
a-in                          ‘mother’ 
a-u                           ‘father’ 
a-in-mohrom                   ‘grandmother’ 
a-o-mohro                     ‘grandfather’ 
a-p'e-ho [spouse = a-p'e]     ‘maternal uncle (elder)’ 
a-sa [spouse = a-na]          ‘maternal uncle (younger)’ 
a-p'e                         ‘maternal aunt (elder)’ 
a-sa                          ‘maternal aunt (younger)’ 
a-k'i [spouse = a-p'e]        ‘paternal uncle (elder)’ 
a-ja [spouse = a-na]          ‘paternal uncle (younger)’ 
a-tro                         ‘paternal aunt (elder)’ 
a-ma                          ‘paternal aunt (younger)’ 
a-ija                         ‘brother (elder)’ 
a-k'o                         ‘brother (younger)’ 
a-ma                          ‘sister (elder)’ 
a-k'o                         ‘sister (younger)’ 
a-ma / a-p'e                  ‘sister-in-law (elder)’ 
a-p'e-ho                      ‘brother-in-law (elder)’ 
a-na                          ‘elder brother’s wife’ 
a-ja                          ‘elder sister’s husband’ 
a-p'e-ho-ao                   ‘father-in-law’ 


2.2. Kinship terms based on the gender distinction 


Gender-based distinction in kinship terminology is also evident in Hruso. The language 
has two sets of gender markers, one for humans and one for animals³. For humans, the masculine 
gender remains unmarked (Ø). In the feminine counterparts only the initial part/phoneme(s) of 
the masculine term is retained, while the following vowels normally undergo 
morphophonemic changes when the feminine gender marker -m is suffixed. Table 2 
illustrates a few examples of gender distinction in human nominals. 


Table 2: Example illustrating gender marking in human nominals in Hruso 


Ø (male)     Gloss        -m (female)     Gloss 
moho         ‘man’        mi-m            ‘woman’ 
droa         ‘friend’     nadra-m         ‘girlfriend’ 
njo          ‘brother’    ni-m            ‘sister’ 
nogo         ‘king’       nogo-m          ‘queen’ 


Except for a few pairs like father-mother, husband-wife, uncle-aunt and elder brother-elder 
sister, feminine gender marking with -m is found in most Hruso kinship terms. The 
kinship terms with the suffixed sex modifier -m are listed in Table 3. 


3 The language uses a different set of gender markers for animals. They are umbo for the ‘male’ and imni for the 
‘female’. 
Table 3: Kinship terms in Hruso with suffix -m 


Hruso           Gloss 
so              ‘son’ 
sa-m            ‘daughter’ 
nju             ‘brother’ 
ni-m            ‘sister’ 
bosao           ‘brother’s/sister’s son’ 
basa-m          ‘brother’s/sister’s daughter’ 
drafodro        ‘son’s/daughter’s father-in-law’ 
drafodra-m      ‘son’s/daughter’s mother-in-law’ 
lwo             ‘brother-in-law (younger)’ 
lwa-m           ‘sister-in-law (younger)’ 
aomohro         ‘grandfather’ 
ainmohro-m      ‘grandmother’ 
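
The pattern described in section 2.2 (the feminine form keeps the initial segment of the masculine form, takes the suffix -m, and usually changes its vowels) can be checked mechanically against the pairs in Table 3. The following is only a minimal sketch in Python. The romanized spellings are taken from the table as printed (with the younger brother-in-law term read as lwo), and the function names follows_pattern and stem_changes are hypothetical helpers introduced here for illustration, not part of the original analysis.

```python
# (masculine, feminine) pairs as printed in Table 3; the hyphen marks the suffix.
# Assumption: the romanization is taken at face value, without phonetic detail.
PAIRS = [
    ("so", "sa-m"),
    ("nju", "ni-m"),
    ("bosao", "basa-m"),
    ("drafodro", "drafodra-m"),
    ("lwo", "lwa-m"),
    ("aomohro", "ainmohro-m"),
]


def follows_pattern(masc: str, fem: str) -> bool:
    """True if the feminine form ends in -m and keeps the masculine's first segment."""
    return fem.endswith("-m") and fem[0] == masc[0]


def stem_changes(pairs):
    """Return the pairs whose feminine stem (suffix removed) differs from the masculine."""
    return [(m, f) for m, f in pairs if f[: -len("-m")] != m]


if __name__ == "__main__":
    for masc, fem in PAIRS:
        print(masc, fem, follows_pattern(masc, fem))
    print(len(stem_changes(PAIRS)), "of", len(PAIRS), "pairs change their stem")
```

On this small sample every pair keeps its initial segment and takes -m, and every feminine stem differs from the masculine one, which is in line with the morphophonemic changes mentioned in the text. The check says nothing about which vowel changes occur or why.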


2.3. Kinship terms for neutral gender 


In Hruso, the term ak'o is used for younger siblings or young ones in the family irrespective 
of gender. Apart from ak'o there is another gender-neutral term: 
mother’s younger brother and sister are both referred to by the term asa. Both 
terms are used to refer to younger ones in the family. 


2.4. Affinal Kinship 


Marriage with patrilineal parallel cousins is considered taboo, a violation of the law of 
exogamy. In the words of Prakash (2007: 1087), ‘like other groups of people living in close 
society in the days of the yore, the Akas are also distinguished by the dual principle of clan 
exogamy and tribe endogamy.’ Marriages between the Hruso and Miji tribes are very common. 
As far as marriages among clans are concerned, however, it has been observed that the 
Hrusos practice exogamy⁴. For instance, the Sasusow, Regisow and Debisow of Sukring 
village, situated 2 km from Thrizino in West Kameng district of Arunachal 
Pradesh, practice exogamy: they consider themselves to belong to the same lineage/clan and 
hence prohibit marriage amongst themselves. By contrast, inter-marriage is possible amongst the 
Dususow, Gebisow, Kabisow, Sechesow, Sarpunsow and Sorisow (of Khuppi and Buragaon 
villages of Jamiri, West Kameng), as they are considered different clans. 

Hruso has a patriarchal society, or patriarchal lineage, which is also evident in the clan 
names that end with -sow /so/ meaning ‘son’, as in Dususow, Gebisow and so on. Polygamy 
is prevalent and acceptable in the community, in that a man may have more than one wife, 
while polyandry, the system of having more than one husband, does not exist in the 
community. Apart from the terms lofi ‘husband’ and fom ‘wife’, the other affinal terms 
are given in Table 4. 


4 Exogamy is the practice whereby a community prohibits marriage between individuals sharing certain degrees of 
blood or affinal relation. When a community prohibits marriage outside the caste or clan, the practice is 
known as endogamy. 
Table 4: Affinal Kinship terms in Hruso 


Hruso                 Gloss 
ap'e-ho-ao / bofu     ‘father-in-law’ 
ap'e-aiy / nisi       ‘mother-in-law’ 
homo                  ‘son-in-law’ 
safo                  ‘daughter-in-law’ 
drafodro              ‘son’s/daughter’s father-in-law’ 
drafodram             ‘son’s/daughter’s mother-in-law’ 
lwo                   ‘brother-in-law (younger)’ 
lwam                  ‘sister-in-law (younger)’ 
ama / nisi            ‘sister-in-law (elder)’ 


These terms normally do not have the prefix a-. In Simon (1993) and Anderson (1896), the 
reference terms for in-laws in Hruso are bofo ‘father-in-law’ and nisi ‘mother-in-law’. 
However, native speakers at present refer to in-laws as ap'e-huv-au and ap'e-aiy, 
which resemble the terms for elder sister-in-law ap'e and elder brother-in-law 
ap'e-ho, except that -au and -aiy are suffixed to them. According to Simon (1993), 
father-in-law and elder brother-in-law share the same term in the language. Likewise, 
mother-in-law and elder sister-in-law share the same term. These are the very 
terms that are used for maternal uncle and aunt. The fact that they share equal status is also 
evident in the use of one term. The reason for having similar terms for maternal 
uncle and aunt and for in-laws is the prevalence of cross-cousin marriages, which are considered 
legitimate in the Hruso community. In this community, one can marry one's cross-cousin, 
the offspring of a maternal uncle, since that person does not belong to the same family lineage. 
The Hrusos, however, practice unilateral cross-cousin marriage, where a man may marry his 
mother's brother's daughter but not his father's sister's daughter. Besides, in this community, 
matrilineal parallel-cousin marriages are also allowed, where a man may marry his mother's 
sister's daughter, as they are from different lineages. In Hruso, a man may even marry his 
mother's younger sister, but he cannot marry his sister's daughter, i.e. his niece. As a 
result of cross-cousin marriages, ego's maternal uncle or mother's brother becomes his 
father-in-law and his mother's brother's son becomes his brother-in-law, and this is reflected in the 
kinship nomenclature as well. The only difference is that, the moment these relations become 
affinal, they receive the additional suffix -au ‘father’ or -aiy ‘mother’. It is interesting to note 
that the pair ap'e-ho and ap'e is the only instance in Hruso where the masculine gender is 
marked and the feminine remains unmarked. 
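
The marriage rules stated in this section can be summarized as a small lookup, read from the point of view of a male ego. The sketch below is in Python and is only an illustration. The kin-type abbreviations (M = mother, F = father, B = brother, Z = sister, D = daughter, y = younger) are a conventional shorthand introduced here as an assumption, and the names PERMITTED and may_marry are hypothetical. Only relations explicitly mentioned in the text are encoded.

```python
# Relations from a male ego's point of view, with the verdict given in the text.
# Assumption: kin-type shorthand (MBD = mother's brother's daughter, etc.)
# is introduced here for compactness; it does not appear in the paper.
PERMITTED = {
    "MBD": True,   # mother's brother's daughter: permitted cross-cousin
    "FZD": False,  # father's sister's daughter: excluded (marriage is unilateral)
    "MZD": True,   # mother's sister's daughter: matrilineal parallel cousin
    "FBD": False,  # father's brother's daughter: patrilineal parallel cousin, taboo
    "MyZ": True,   # mother's younger sister
    "ZD": False,   # sister's daughter (niece)
}


def may_marry(relation: str) -> bool:
    """Return the verdict recorded in the text for a given relation."""
    if relation not in PERMITTED:
        raise KeyError(f"the section gives no rule for {relation!r}")
    return PERMITTED[relation]


if __name__ == "__main__":
    for relation in PERMITTED:
        print(relation, "permitted" if may_marry(relation) else "prohibited")
```

Every permitted relation in this list runs through ego's mother, which is consistent with the explanation that such relatives belong to a different lineage. The one prohibited relation on the female side, the sister's daughter, shows that lineage alone does not decide the matter.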


3. Kinship system in PTB 


Over the years linguists such as Shafer, Benedict and Matisoff, among many others, have attempted 
to reconstruct the Proto-Tibeto-Burman (PTB) roots, which help in grouping all the TB 
languages under one phylum. The descriptive account of the Hruso kin terms brings out 
interesting similarities with the kinship system of Classical Tibetan or Written Tibetan 
discussed in Benedict (1942), and shows a certain affinity with many reconstructed PTB 
roots. The PTB roots in this study are cited from Matisoff's Sino-Tibetan Etymological 
Dictionary and Thesaurus (STEDT) database. To begin with, most of the consanguineal kin 
terms in Hruso have the prefixed a-, which is in fact “characteristic of the Tibeto-Burman 
languages as a whole and can be traced also in Chinese” (Benedict 1942). The regular parent 
terms in Written Tibetan (WT) are a-ma ‘mother’ and a-pha ‘father’, from the universally 
extended PTB roots *ma and *p’a, and the most commonly used grandparent terms are 
phyi-mo ‘grandmother’ and mes-po ‘grandfather’ (Benedict 1942: 316). Here the meaning of 
mes is ‘ancestor’ (Nagano 1994: 105). If we consider other Tibeto-Burman terms for ‘father’ 
like awa in Koch or ava in Tangsa and Tangkhul, and ao in Hruso, we can imagine a possible 
change from the PTB root *p'a or *pha that might have taken place over time as p' > b > w 
> u. Again, the mother term in Hruso, a-iy, can be seen as similar to an or in in Tikhak, a variety 
of the Tangsa group belonging to the Tibeto-Burman language family. The former means ‘mother’ 
without possessive implication, while the latter means ‘mother’ with the possessive 
implication ‘my mother’ (Parker, forthcoming). Besides, the terms for ‘mother’, anuih in 
Dhammai and anye in Bangru, also have a nasal consonant (see Table 6). 

Hruso only has terms for the immediate grandparents. This is in keeping with the normal 
Tibetan pattern, where terms for the 3rd and the 4th ascending generations are not 
common (Benedict 1942: 315). Benedict states that the grandparent terms phyi-mo and 
mes-po lack cognates in other Tibeto-Burman languages and are probably based on the 
sex-modifiers -mo and -po, respectively. This is also true for Hruso, where the grandparent terms 
are based on the parent terms in combination with the words for old man and old woman, as in 
aig-mohrom ‘grandmother’ and ao-mohro ‘grandfather’. In WT, sex distinctions are made 
through combination with the sex modifiers mo and bo; for instance, nu-mo means ‘younger 
sister’ and nu-bo means ‘younger brother’ (Benedict 1942). The PTB roots for male and 
female are *p/bV and *mV, respectively. In Hruso, however, although the feminine gender 
marker -m has affinity with the WT sex modifier mo or the PTB root *mi 
‘girl/female’, the language, unlike other Tibeto-Burman languages, does not mark male nominals. 
In Apatani, there are separate gender markers ni-mu ‘female’ and mi-lo-bo ‘male’ that bear 
affinity to PTB roots (STEDT database). 

The Hruso term ak'o ‘younger sibling’ corresponds to the PTB root *ko: for neutral 
gender. The Proto-Tibetan kinship terms for neutral gender mostly denote relatives younger 
than ego. The Hruso term has clear cognates in many other Tibeto-Burman languages, such as ko or ku in 
the Tani languages. 

Hruso affinal nomenclature is not extensive, and this holds true for Tibetan as well. 
If we look at the Hruso terms bofo ‘father-in-law’ and nisi ‘mother-in-law’, we observe that the 
former corresponds to the PTB form *bo ‘father’ while the latter is extended from *ney ⪤ 
*ni(y). There is no proto-form for brother-in-law, whereas the proto-form for 
sister-in-law is *mow and Suen. The term lwam can be said to have retained the gender-based suffix from 
the proto-form. Similarly, there is no proto-form for the in-laws of ego's son/daughter. The 
proto-form *krway does not find extension in the Hruso terms humo ‘son-in-law’ or safo 
‘daughter-in-law’, though safo may have some affinity with the form *s-nam 
‘daughter-in-law/wife/sister’. 

As mentioned earlier, cross-cousin marriages are considered legitimate in the Hruso 
community, and they are also claimed to be ‘a conspicuous feature in both Tibetan and Chinese 
cultures’ (Benedict 1942: 337). The Hrusos practice a unilateral type of cross-cousin marriage, 
which is also prevalent in Kuki, a language belonging to the Tibeto-Burman family 
(Benedict 1942: 326). In contrast, the Noctes⁵ of Arunachal Pradesh allow bilateral 
cross-cousin marriage, where a man may marry either his mother's brother's daughter or his 
father’s sister’s daughter⁶. It is noteworthy that two languages of the Hrusish group, namely 
Dhammai and Bangru, share the same term for father-in-law and 
grandfather: alou in Dhammai and alo in Bangru are used for both. 
In Hruso, however, this phenomenon is not seen; it has distinct kin 
terms for the two. This is illustrated in Table 6 in section 4. In this context, we can refer to 
Benedict (1942: 327), who points out that in Tibetan, with the advent of teknonymy, a 
man addresses his in-laws by the terms employed by his own child: father-in-law is called 
‘grandfather’, and ego’s mother's brother can be equated with grandfather. This is also 
seen in Hill Miri, a Tibeto-Burman Tani language, where a-to is used for both grandfather and 
father-in-law (STEDT database). In this regard Hruso is quite distinct from what is typical of 
Tibetan or other Tibeto-Burman languages. Nonetheless, equating mother's brother's child 
with mother's brother is prevalent in Hruso, which can be seen as an extension of the practice of 
equating mother's brother with grandfather. In other words, elder maternal uncle and his son 
have equal status and share the same term, ap'e-ho. 


5 The Noctes are one of the tribes of Arunachal Pradesh, mainly inhabiting Changlang and Tirap districts in the 
easternmost part of the state. Their language belongs to the Tibeto-Burman language family. 

6 Source: The researcher collected this information from Ms. Trisha Wangno, a native speaker of Nocte hailing 
from Changlang District of Arunachal Pradesh. 

Coming to affinal kin terms, in PTB the terms for in-laws and those for maternal uncle and 
aunt are mostly the same, and this is true of Hruso as well, even though the Hruso terms may 
not be extensions of the PTB roots. Here Morgan's observation is very apt: 
vocabulary changes, but the structure of the system remains the same. The Hruso structure of 
kinship terminology closely resembles the PTB structure. The WT term for ‘father-in-law’ is 
gyos-po and the one for ‘mother-in-law’ is sgyug/gyos-mo. Benedict (1942: 322), 
nevertheless, notes that ‘neither gyos nor sgyug has any extension in other TB 
languages’. The PTB term for father-in-law is *to and that for mother-in-law/aunt is *ney ⪤ 
*ni(y). The former finds extension in d-to ‘father-in-law’ in Apatani and a-to in Hill Miri, 
two TB languages of the Tani group in Arunachal Pradesh. The latter finds 
extensions in many TB languages, such as ma-ni in Garo, ni in Karbi and so on (STEDT 
database). In contrast, the Hruso terms ap'e-ho and ap'e are not possible extensions of the PTB 
roots *to and *ney ⪤ *ni(y), respectively. 

Another point where Hruso kinship terminology differs from other Tibeto-Burman 
language groups is the reference terms for parents’ siblings. The terms for parallel 
uncle and aunt are normally formed by combining the elements chun ‘small’ and chen ‘large’ in 
many Tibetan dialects, and this is also observed in other TB languages, as in Lepcha a-bo-tim 
and a-bo-tsum, Jingpaw wa-di and wa-doi, and Burmese p’a-kri and pa t'we. These pairs of kin 
terms refer to ‘father’s older brother’ and ‘father’s younger brother’ as ‘big father’ and ‘small 
father’, respectively (Benedict 1942: 316). Even in the Nocte language spoken in Changlang 
district of Arunachal Pradesh, the terms for ‘father’s older brother’ and ‘father’s younger 
brother’ are pa-doy and pa-di, where pa is ‘father’ and doy and di mean ‘large’ and 
‘small’, respectively⁷. In Hruso, however, such combinations are not available. Instead, it has 
separate terms for father's older brother and father's younger brother, namely ak'i and 
aja, respectively, as illustrated in Table 5. The latter shares the same term 
with elder brother in the language. 


Table 5: Kinship terms for Maternal/Paternal uncle and aunt in Hruso 


Hruso                         Gloss 
ap'e-ho [spouse = ap'e]       ‘Maternal uncle (elder)’ 
asa [spouse = ana]            ‘Maternal uncle (younger)’ 
ap'e [spouse = ap'e-ho]       ‘Maternal aunt (elder)’ 
asa [spouse = aija]           ‘Maternal aunt (younger)’ 
ak'i [spouse = ap'e]          ‘Paternal uncle (elder)’ 
aja [spouse = ana]            ‘Paternal uncle (younger)’ 
atro [spouse = ak'i]          ‘Paternal aunt (elder)’ 
ama [spouse = aija]           ‘Paternal aunt (younger)’ 
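
Table 5 can also be read as a lookup keyed by side of the family, sex and relative age, which makes it easy to see which terms are shared between uncles and aunts. The Python sketch below is offered only as an illustration. It transcribes the table as printed, writing aspiration with an apostrophe (an assumption about the intended spelling), and the names Entry, TABLE_5 and gender_neutral_terms are hypothetical.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    term: str    # Hruso term for the uncle or aunt
    spouse: str  # term given in the table for that person's spouse


# Keys are (side, kind, age), following the row labels of Table 5.
# Assumption: aspiration is written with an apostrophe (ap'e, ak'i).
TABLE_5 = {
    ("maternal", "uncle", "elder"): Entry("ap'e-ho", "ap'e"),
    ("maternal", "uncle", "younger"): Entry("asa", "ana"),
    ("maternal", "aunt", "elder"): Entry("ap'e", "ap'e-ho"),
    ("maternal", "aunt", "younger"): Entry("asa", "aija"),
    ("paternal", "uncle", "elder"): Entry("ak'i", "ap'e"),
    ("paternal", "uncle", "younger"): Entry("aja", "ana"),
    ("paternal", "aunt", "elder"): Entry("atro", "ak'i"),
    ("paternal", "aunt", "younger"): Entry("ama", "aija"),
}


def gender_neutral_terms(table):
    """Return the terms that the table lists for both an uncle and an aunt."""
    uncles = {e.term for (_, kind, _), e in table.items() if kind == "uncle"}
    aunts = {e.term for (_, kind, _), e in table.items() if kind == "aunt"}
    return uncles & aunts


if __name__ == "__main__":
    print(TABLE_5[("paternal", "uncle", "elder")].term)
    print(sorted(gender_neutral_terms(TABLE_5)))
```

With the data as transcribed, the only term shared by an uncle row and an aunt row is asa, which agrees with the statement in section 2.3 that mother's younger brother and sister are referred to by one gender-neutral term.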


7 Source: Same as footnote 6. 
3. Comparison of Hruso, Dhammai (Miji) and Bangru (Levai) kinship terms 


As mentioned earlier, Hruso along with Dhammai (Miji) and Bangru (Levai) 
constitutes the Hrusish sub-group within the Tibeto-Burman language family (Shafer 1947 and 
Burling 2003). The Hrusos and Dhammais inhabit the Kameng districts, while the Bangrus live in Sarli 
circle of Kurung Kumey district (erstwhile Lower Subansiri) of Arunachal Pradesh. Thus, 
geographically, the Bangrus are not as close to the Hrusos as the Dhammais are. However, the three 
languages exhibit interesting linguistic similarities, as a result of which they form a sub-group. Ramya 
(2012) also writes that, according to popular Bangru belief, these three communities have a common 
origin. In his account, the Bangrus prefer to call themselves ‘Taju-Bangru’ and 
believe the Hrusos and Dhammais to be of the same descent, under the ‘Wadu-Bangru’ 
branch. 


Table 6: Comparing Hruso with Dhammai (Miji) and Bangru (Levai) and PTB kinship terms. 


PTB                                 Hruso            Dhammai            Bangru              Gloss 
*ma                                 ain              anuih              anye                ‘Mother’ 
*p*/ba, *bo                         ao               abo, bu            miibi               ‘Father’ 
*(y)ay                              aimohrom         azhui              asse                ‘Grandmother’ 
*ta:y                               aomohro          alou               alo                 ‘Grandfather’ 
KAN                                 ap'e-ho          akhiw              kiini               ‘Maternal uncle (elder)’ 
*yay ⪤ *ʔay, *ney ⪤ *ni(y)          ap'e             acho               achowa              ‘Maternal aunt (elder)’ 
-                                   ak'i             avang              -                   ‘Paternal uncle (elder)’ 
*ni(y)                              atro (elder),    amo                mesebya             ‘Paternal aunt (elder)’ 
                                    ama (younger) 
*tay, *ʔik, *buy                    aija             aku-vo/mukhuvo     ako (elder),        ‘brother (elder)’ 
                                                                        miniw (younger) 
*me                                 ama              amu, amo           mesebya             ‘sister (elder)’ 
*bwr-mo:                            ak'o             neh                                    ‘sister (younger)’ 
*gwa ⪤ *wa, *pa-*sal, *mi-lo        lofi             dighai             melgya              ‘Husband’ 
*s-nam, *hma^m                      fum              zhi                mii                 ‘Wife’ 
*to                                 ap'e-ho          alou               alo                 ‘father-in-law’ 
*ney ⪤ *ni(y)                       ap'e             azhui              asse                ‘mother-in-law’ 
-                                   bosao            -                  dguchobya           ‘brother’s/sister’s son’ 
-                                   bosaum           -                  dguchobii           ‘brother’s/sister’s daughter’ 
*la(:), *Qaa                        sam              zeh, zumraih       mudu-nyiwai         ‘daughter’ 
*nu, *bu, *ko:                      ak'o             amai                                   ‘child’ 
*tsa-n ⪤ *za-n                      so               zu                 mudgu-nyüb          ‘son’ 
In Table 6, an attempt has been made to bring together some of the kinship terms in these 
languages. Hruso kinship terms are compared with Dhammai and Bangru⁸ terms in 
order to assess the degree of similarity and difference among them and also to assess their 
correspondence with the reconstructed proto-forms cited from the STEDT database. 

Table 6 tries to illustrate how each of these languages, claimed to a subgroup within the 
greater Tibeto-Burman family, differs from each other or form cognates and also how closely 
each of them resembles the proto-forms given in the first column at the left of the Table. In 
keeping with Morgan's definition (1859) about how vocabularies or naming may change but 
the underlying structure of a kinship system is more resistant to change, Table 6 brings forth 
certain prominent PTB features retained in these languages. Nevertheless, it is noteworthy 
that though Hruso, Dhammai (Miji) and Bangru (Levai) are grouped together as Hrusish, they 
exhibit interesting points of dissimilarities as shown in the Table. The PTB feature of the 
kinship terms with prefix a- is evident in most of the basic terms beginning with the parent 
terms. However, there are exceptions to this general rule in Bangru miibi ‘father’. But in 
terms of finding cognates across these languages, it appears that Dhammai abo and Bangru 
miibi resemble the PTB root *bV for ‘father’ while, the Hruso term au does not. Another 
possibility could be the loss of the consonant sound /b/ in Hruso in due course of time which 
is, on the other hand, retained by the other two members of the group. The terms for 'mother' 
a-iy in Hruso, nonetheless, find clear cognates in anuih in Dhammai and anye in Bangru 
evident from the presence of a nasal consonant. The terms used for grandparents like alou and 
azuih in Dhammai and a/o and asse in Bangru are also used for in-laws in both the languages 
This PTB feature also known as teknonymy, where a man addresses his in-laws by the terms 
employed by his own child , is not seen in Hruso. If we look at Table 6 we find that the PTB 
root *(yJay is for ‘mother-in-law/grandmother/maternal aunt’. In other words, mother-in-law 
and maternal aunt share the same terminology due to cross-cousin marriage, and mother-in- 
law and grandmother share the same terminology because of teknonymy. However, unlike 
Dhammai and Bangru, Hruso does not have the PTB feature of teknonymy. Nonetheless, it 
has retained the feature of cross-cousin marriage. The term for paternal aunt and for elder 
sister remains the same in each of these languages for instance, ama in Hruso, amo in 
Dhammai and mesebya in Bangru. These terms are extensions of PTB root *me. Table 6 
shows that, unlike Dhammai and Bangru, Hruso has separate terms for father's elder and
younger sister, and the term used for father's younger sister, ama, is also used to refer to
ego's elder sister. Similarly, the term for father's younger brother is the same as that for
elder brother in the language.

As is evident from Table 6, the term a-K"i for paternal uncle can be seen as an extension of 
PTB root *k’u (also called khu in WT). Likewise, the term for son, so in Hruso, zu in 
Dhammai and mudgu-nyiib in Bangru can be said to have extended from PTB *tsa-n Xx *za-n 
as is evident from the presence of fricative and affricate sounds in each of these terms. The 
term sam 'daughter' in Hruso may not directly correspond to the PTB form given in Table 6 
but it bears affinity to *s-nam which stands for daughter-in-law, wife or sister in PTB. The 
Bangru term for husband, me/gya, can be taken as an extension of PTB *mi-lo, whereas the
Dhammai term dighai for husband shows affinity with *gwa*. The Table shows that Bangru
and Dhammai form cognates, whereas Hruso differs in many of the kinship terms.


* The Dhammai (Miji) data have been cited from Miji Language Guide by Simon (n.d.) and the Bangru data have
been cited from Ramya (2012).
4. Observation


The investigation into the kinship terminologies in Hruso has put forth some interesting facts 
regarding its classification as a Tibeto-Burman language. The presence of PTB features is 
conspicuous in Hruso kinship terminologies. Most of the kinship reference/address terms have 
the prefix a-, as in a-u ‘father’, a-iy ‘mother’, a-p^e-ho ‘uncle’, a-p^e ‘aunt’ and so on. Sex
distinction is made by suffixing the sex modifier -m for feminine gender, which
corresponds to the *mV root in PTB. The gender-neutral term a-k’o is used for younger
siblings or young ones in the family. Besides, unilateral cross-cousin marriage, which results in
the use of the same term for maternal uncle and father-in-law, is also reflected in the language.
Matrilineal parallel-cousin marriages are allowed, which likewise results in shared terminology.
The presence of only the immediate grandparent terms, and the fact that they are based on the
parent terms in combination with the words for ‘old man’ and ‘old woman’, reflect much of the
proto-system. The parent terms in the language, a-iy and a-u, find similarities in other
Tibeto-Burman languages, even though a-iy does not directly correspond to PTB a-ma.

Nevertheless, along with the similarities, some differences from PTB have also been
observed in the analysis. Many of the terms do not form clear cognates with the reconstructed
PTB roots. Further, gender marking in Hruso is unlike that of other Tibeto-Burman languages.
As mentioned earlier, unlike other Tibeto-Burman languages, Hruso has separate terms
for father's elder and younger sister, and these terms do not need to be combined with
elements like ‘small’ and ‘large’. Again, the PTB feature of sharing the same terminology for
father-in-law and grandfather is missing in Hruso. In other words, Hruso differs from
Dhammai and Bangru, which have more apparent cognates, as is evident from the comparative
analysis. These observations may suggest either that Hruso is a language isolate or that the
nomenclature of the Hrusish sub-group needs to be reconsidered. The paper thus raises the
question of whether Dhammai, Bangru and Hruso form a sub-group, or whether only the first
two form a sub-group while Hruso stands apart. Despite these interesting deviations, the
Hruso kinship system leans towards Tibeto-Burman features, and other linguistic aspects
therefore need to be examined in order to establish its genetic affiliation.


Abbreviations


n.d.  No date
PTB   Proto-Tibeto-Burman
TB    Tibeto-Burman
WT    Written Tibetan
Abstract Bugun, Deuri and Nocte are spoken in the Northeastern states of Arunachal Pradesh and Assam. Deuri
and Nocte belong to the Bodo-Konyak-Jingphaw group. Bugun is grouped with the Kho-Bwa cluster of
languages comprising Sherdukpen, Lish and Puroik (van Driem 2007a), though not all linguists follow
this classification. This paper is an attempt to show the similarities and variations in the counting systems
of these languages. Mazaudon (2010) states that there is a range of different numeral systems among
Tibeto-Burman languages (see section 2). The languages examined in this paper show a decimal numeral
system in Bugun and Nocte and a vigesimal system in Deuri.
1. Introduction


“Numbers have been sometimes considered as part of the core vocabulary, that part of the
vocabulary which remains the most stable over time” (Mazaudon 2010: 118). In other words,
numbers can be one of the tools that help identify and classify languages. However, contact with
more prestigious languages does lead to borrowing, with the result that some features of
the numeral systems of lesser-known and endangered languages are lost.

This paper examines the numeral systems of three Tibeto-Burman languages, namely
Bugun, Deuri and Nocte. Deuri belongs to the Bodo-Koch sub-group (Burling 2003,
Jacquesson 2005). It is spoken in the Lakhimpur, Dhemaji, Dibrugarh, Jorhat, Tinsukia, Sonitpur
and Sivsagar districts of Assam. According to the Census Report of 2001, the total population of
Deuri in Assam is 41,161 and the number of active speakers of Deuri is 27,960. Nocte falls under
the Konyak subgroup of the Boro-Konyak-Jingphaw¹ group (Burling 2003) and is spoken by 35,000
speakers in the Tirap and Changlang districts of Arunachal Pradesh. The Nocte data are a mix of
the Dadom² and the Hakhun varieties. These two varieties were listed by Parul Dutta (1978).
Bugun is spoken in the West Kameng district of Arunachal Pradesh. As per the 2011 Census
Report, the total population of Bugun is 1432. Bugun is imminently endangered on two
counts: firstly, the number of active speakers is low, and secondly, most of the young people,
especially the educated ones, have shifted to Hindi and Nepali for everyday
communication. Bugun is yet to be classified into one of the Tibeto-Burman sub-groups. Van
Driem (19972) assumes that Sherdukpen-Bugun-Sulung (Puroik) may belong to the Kho-Bwa
language cluster, whereas Blench and Post (2014) tentatively call them Kamengic.

The focus of this study is to:


i. investigate the numeral systems of Bugun, Nocte and Deuri,
ii. find out how the base numerals build compound numerals in these languages, and
iii. determine to what extent the number-building systems vary across these languages.


¹ Benedict (1975) places Boro-Konyak-Jingphaw into one subgroup.
² Nowadays, native speakers of Nocte tend to blend both the Dadom and the Hakhun varieties.
2. Cardinal Numerals in Bugun³, Deuri and Nocte


Mazaudon (2010: 117) states that while there are a range of different numeral systems, the 
vast majority of Tibeto-Burman languages have decimal numeral systems. Of the three 
Tibeto-Burman languages discussed in this paper Bugun and Nocte have decimal numeral 
systems but with marked differences in their numeral building system. Deuri has a vigesimal 
system. Table 1 shows the core cardinal numerals of the three languages. 


Table 1 — Core Cardinals in Bugun, Deuri and Nocte‘ 


Gloss Bugun Deuri Nocte 

One ió / dó muza want'é 

Two yin muhuni wanni 

Three im muņda wanram 

Four wi mudffi (wan)bali 
Five gua mumuwa (wan)bagá 
Six rab muffu (wan)ro‘/aro? 
Seven mılījá mufiy (wan)yid nid" 
Eight miljá mulffe (wan)sat /asat 
Nine digé mudugu (wan)k^i/ iki 
Ten sud muduga (wan) k'i/ ifft 


Of the three languages, Bugun does not take a prefix to build its cardinal numerals one to
ten, whereas Deuri and Nocte use prefixes to form the core numerals, as seen in Table 1. In Deuri
the prefix mu- is obligatory, whereas in Nocte the prefix wan- can be dropped. The prefix
wan- is the full form; however, /a/ becomes /a/ in connected speech or while reciting the
numerals. From cardinal four onwards wan- is dropped, although some Nocte varieties retain
it, as indicated by the parentheses. In the numerals 6, 7 and 8 the prefix wan- is replaced by
/a/, and in 9 and 10 by /i/. In Deuri the base numerals shown in Table 2 take the mu- prefix to
form the core numerals in Table 1. The cardinal numerals fa and za ‘one’ in Deuri are
variants; native speakers use both forms interchangeably.


Table 2 — Deuri base numerals ‘one’ through ‘ten’ 


Deuri | Gloss | Deuri | Gloss
a/za | One | u | Six
kuni/kini/huni/hini | Two | fin | Seven
nda | Three | ffe | Eight
ffi | Four | dugu | Nine
muwa | Five | duga | Ten


³ The Bugun numeral data were collected from Singchung, Wanghoo and New Kaspi during field trips to West
Kameng for the UGC major research project 2009-2011 by Madhumita Barbora. The Deuri data were collected
by Prarthana Acharya from Narayanpur, Lakhimpur district, Assam. Kishore Deori and Gaurav Deori were the
informants. The Nocte data were collected by Trisha Wangno, a native speaker of Nocte, from Bordumsa,
Changlang district, Arunachal Pradesh.

⁴ Native speakers mentioned that both fa and za are used by them for ‘one’.

⁵ In this paper the tone marked on the Nocte data is based on Praat analysis of the recorded data collected during
fieldwork by Trisha Wangno in Bordumsa, Changlang district of Arunachal Pradesh. In Weidert (1987) the
Nocte words have tone number 2 for ‘one’, tone number 3 for ‘two’ and ‘four’ and tone number 1 for ‘ten’. We
follow Trisha's analysis in this paper.
The Nocte base numerals one to ten shown in Table 3 combine with the prefix wan- to
form the core numerals shown in Table 1.


Table 3 — Base numerals in Nocte 


Nocte Gloss Nocte Gloss 
té One arók Six 

nt Two and" Seven 
ram Three asat Eight 
belí Four ik^i Nine 
Dand Five iff'í Ten 


The occurrence of a prefix in the numeral system of a Tibeto-Burman language is not 
obligatory. 

From Table 1 we observe that Bugun numerals have three tones: high, mid and low. It
must be noted that tone in Bugun is disappearing, mainly due to the impact of languages like
Hindi, Nepali, Assamese and others. Nocte⁶ numerals show mid and high tones. Deuri
cardinals do not show tone. Jacquesson (2005) remarks that "tonal opposition is dying in Deuri, it
certainly was prosperous although it is difficult to locate chronologically".


2.1. The cardinals twenty and hundred 


Besides the basic numerals in Table 1, Bugun has two other basic numerals it/"ak ‘twenty’ 
and wiam dog ‘one hundred’. In § 2.2.1 we look into the it/*ak numeral in detail. Deuri has a 
prefix kuwa- which helps in constructing higher digits twenty onwards in the language. See 
§2.2.2 for details. Nocte has the prefix ruwok- ‘ten’ which builds the numerals twenty to 
ninety nine. See §2.2.3 for details. For cardinal numerals from hundred onwards, Bugun has the
basic numeral wiam dog ‘one hundred’ (see §2.3.1), and Nocte has Hare ‘one hundred’ (see
§2.3.3). Going by Table 1 and Table 4 below, we find that Bugun and Nocte have twelve
basic numerals each, while Deuri has eleven (see §2.3.2).


Table 4 — Cardinals twenty and hundred 


Bugun Deuri Nocte Gloss 
iff'ak kuwaffa ruwokni Twenty 
wiam da ya thé One Hundred 


2.2. Number Building 


According to Comrie (2005) and Mazaudon (2010), human languages apply an arithmetic
base in constructing numeral expressions. The “base” of a numeral system is the value n
such that numeral expressions are constructed according to the pattern ...xn + y, i.e. some
numeral x multiplied by the base n plus some other numeral y. The order of elements is
irrelevant, as are the particular conventions used in individual languages to indicate addition
and multiplication. In Bugun, Deuri and Nocte the numeral expressions are constructed
according to the pattern ...nx + y. In some cases conjunctive markers are used to indicate this


⁶ Tone variations are hardly noticed in Bugun, Deuri and Nocte due to language contact. As most speakers use
Hindi, Assamese or Nepali in their everyday life, they have lost tone in their native languages. Also, due to the
lack of active use of their native languages, they can no longer distinguish tone variations, nor can they use them.
mathematical process. The following sub-sections give the details of this process in Bugun
(§2.2.1), Deuri (§2.2.2) and Nocte (§2.2.3).
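
The pattern xn + y can be made concrete with a short sketch. The Python below is an illustration only and is not part of the analysis reported here; the function name decompose is hypothetical, and the sketch assumes non-negative integers and a single base n.

```python
from typing import Tuple


def decompose(value: int, base: int) -> Tuple[int, int]:
    """Split value into (x, y) so that value == x * base + y and 0 <= y < base."""
    if base < 2:
        raise ValueError("base must be at least 2")
    if value < 0:
        raise ValueError("value must be non-negative")
    # divmod returns the multiplier x and the remainder y in one step
    return divmod(value, base)


# 21 on a decimal base (the Bugun and Nocte type): x = 2, y = 1
print(decompose(21, 10))
# 61 on a vigesimal base (the Deuri type): x = 3, y = 1
print(decompose(61, 20))
```

The same value thus receives a different (x, y) analysis depending on the base, which is the contrast between the decimal and the vigesimal languages examined in the sub-sections that follow.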


2.2.1. Addition in Bugun 


In Bugun, the conjunctive marker na, whose full form is nana ‘and’, combines the numerals in
the addition process. When the numerals eleven to nineteen are built, the cardinal numeral
sud ‘ten’, i.e. the base n, combines with the conjunctive na ‘and’ and is followed by y, i.e.
one of the basic cardinals one to nine, which is added to the base numeral. For instance,
when dio ‘one’ follows sud na to form sud na dio ‘ten and one’, we have ‘eleven’. When
native speakers build the numerals from eleven to nineteen, they use sna, the weak form of
sud na, in which the diphthong /uà/ is deleted to form the syllable-initial cluster sn-, as
shown in Table 5.


Table 5 — Bugun cardinal numerals eleven to nineteen 


Bugun Gloss Bugun Gloss 

sna dio Eleven sna rab Sixteen 
sna pin Twelve sna milija Seventeen 
sna m Thirteen sna milaja Eighteen 
sna wi Fourteen sna dige Nineteen 
sna kuó Fifteen 


For the cardinals 21 to 29, the conjunctive na follows the basic numeral hak
‘twenty’ and precedes the numerals 1 to 9. In regular speech native speakers drop the
conjunctive na.


Table 6 — Bugun cardinal numerals twenty one to twenty nine 


Bugun Gloss Bugun Gloss 

itf'ak na ió Twenty one itf'ak na rab Twenty six 
itf'ak na pin Twenty two itf'ak na milijá | Twenty seven 
itf'ak na im Twenty three | itfak na milójá | Twenty eight 
itf'ak na wi Twenty four | it/"ak na digé Twenty nine 
itf'ak na kua Twenty five 
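
The additive pattern of Tables 5 and 6 can be sketched at the level of English glosses. The code below is a hedged illustration and not the authors' procedure: the function name, the UNITS list and the drop_conjunction flag are hypothetical, and glosses are used instead of Bugun forms so that no transcription is implied.

```python
UNITS = ["one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]


def bugun_additive_gloss(value: int, drop_conjunction: bool = False) -> str:
    """Gloss-level analysis of Bugun 11-19 and 21-29 as BASE (and) UNIT."""
    if 11 <= value <= 19:
        base, unit = "ten", value - 10
    elif 21 <= value <= 29:
        base, unit = "twenty", value - 20
    else:
        raise ValueError("only 11-19 and 21-29 are covered by Tables 5 and 6")
    # The text reports dropping of the conjunction only for 21-29,
    # so the flag is ignored for the teens (an assumption of this sketch).
    dropped = drop_conjunction and base == "twenty"
    parts = [base] if dropped else [base, "and"]
    parts.append(UNITS[unit - 1])
    return " ".join(parts)


print(bugun_additive_gloss(11))
print(bugun_additive_gloss(29, drop_conjunction=True))
```

In both ranges the base n comes first and the added numeral y comes last; only the presence of the conjunction varies.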


2.2.2. Addition in Deuri 


In Deuri the numbers 11 to 19 are built by adding the basic numerals 1 to 9 to 10. Unlike
Bugun, Deuri does not take a conjunctive marker. The prefix mu- attaches to the numerals
combined in this process. For instance, muduga ‘ten’ combines with muza ‘one’ to build the
cardinal ‘eleven’. In Deuri we see that the base n is followed by y.


Table 7 — Deuri cardinals eleven to nineteen


Deuri Gloss Deuri Gloss 
muduga muza Eleven muduga muffu Sixteen 
muduga muhuni Twelve muduga mufiy Seventeen 
muduga munda Thirteen muduga muffe Eighteen 
muduga mudffi Fourteen muduga mudugu | Nineteen 
muduga mumuwa | Fifteen 
Deuri numerals above 19 take the prefix kowa-. Table 8 shows the cardinals 20 to 29.
The cardinals from 21 to 29 are instances of addition, where the basic cardinals 1 to 9 are
added to kowatfa ‘twenty’. This basic numeral is itself derived by multiplication, kowa x t/a
‘20 x 1’, as shown in Table 8. The cardinal ‘twenty’ derived by this process is then combined
with the basic numerals ‘one’ to ‘nine’ to build the higher numerals. In Deuri, then, the base
numeral n multiplies x, and y is added to build the higher numeral. It is to be noted that to
form ‘twenty’ the language uses ffa, one of the variants for ‘one’, and to form the next higher
numeral the other variant, muza ‘one’, is used. Thus ‘twenty one’ in Deuri is formed by the
addition of kowaffa ‘twenty’ and muza ‘one’; see Table 8.


Table 8 — Deuri cardinals twenty to twenty nine 


Deuri Gloss Deuri Gloss 
kuwaffa Twenty kuwafa mumuwa | Twenty five 
kuwaa muza Twenty one | kuwaffa muffu Twenty six 
kuwaa muhuni | Twenty two | kuwaffa mu/fiy Twenty seven 
kuwatfa mugda | Twenty three | kuwaffa muffe Twenty eight 
kuwaffa mudffi Twenty four | kuwafa mudugu Twenty nine 


2.2.3. Addition in Nocte 


Nocte, like Deuri, does not take any additive marker to build the cardinal numerals eleven to
nineteen. But unlike Deuri, which retains the mu- prefix, Nocte does not attach the prefix
wan- to it/"i ‘ten’ in the numbers 14 to 19; instead, the vowel i- is dropped when the higher
numerals are formed by addition. In Nocte the base n is followed by y. Table 9 below shows
how the cardinals 11 to 19 are built: the peripheral (h ‘ten’ combines with the numerals 1 to
9.


Table 9 — Nocte cardinals eleven to nineteen 


Nocte Gloss Nocte Gloss 
fi want"é | Eleven firók | Sixteen 
UT wanni Twelve yi gid' | Seventeen 


UT wànram | Thirteen U sot Eighteen 


a beli Fourteen | f/ík"u ` | Nineteen 
If'íbaná Fifteen 


To build the numerals from twenty to twenty nine, the prefix ruwok- ‘ten’ multiplies with the
base numeral ni ‘two’ to derive the cardinal ruwokni ‘twenty’. Nocte thus has two variants for
‘ten’: itf^i, a base numeral, and ruwok-, a prefix. The derived cardinal then combines with the
core numerals 1 to 9 to build the higher numbers. In Nocte, then, the base n multiplies with x,
and y is added to build the higher numbers. The prefix wan- is dropped for the numbers 24 to 29.


Table 10 — Nocte cardinals twenty to twenty nine 


Nocte Gloss Nocte Gloss 
ruwokni Twenty ruwokni bagá | Twenty five 
ruwokni want'é Twenty one ruwokni irók | Twenty six 
ruwokni wanni Twenty two ruwokni inid’ | Twenty seven 
ruwokni wànram | Twenty three | ruwokni isat Twenty eight 
ruwokni belt Twenty four | ruwokni (ct Twenty nine 
From Tables 9 and 10, we see that Nocte uses two ways to build compound numerals. In Table
9, the base tfi, i.e. n, is followed by y, and in Table 10 the base ruwok-, i.e. n, multiplies the
numeral ni ‘two’, i.e. x, to form ruwokni ‘twenty’ and then adds the other numerals, i.e. y, to
build compound numerals.
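
The two Nocte strategies can be sketched as follows. This is an illustrative sketch under stated assumptions, not a claim about Nocte morphology: the function name is hypothetical, and the output is an arithmetic string in which 10 stands for whichever ‘ten’ element the language uses in that range.

```python
def nocte_structure(value: int) -> str:
    """Return the arithmetic analysis of a Nocte numeral between 11 and 99."""
    if 11 <= value <= 19:
        # Table 9 type: the base n is directly followed by y
        return f"10 + {value - 10}"
    if 20 <= value <= 99:
        # Table 10 type: the base n multiplies x, and y (if any) is added
        x, y = divmod(value, 10)
        expression = f"10 x {x}"
        if y:
            expression += f" + {y}"
        return expression
    raise ValueError("only 11-99 are covered by this sketch")


for number in (11, 20, 21, 90):
    print(number, "=", nocte_structure(number))
```

The switch between the two branches at twenty mirrors the switch between the two ‘ten’ elements described in the text.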


2.3. Multiplication 


Multiplication is a process to build up multiples of ten in Bugun and Nocte, which is typical 
of the decimal system. In Deuri we see multiples of twenty. 


2.3.1. Multiplication in Bugun 


In Bugun the multiples from thirty to ninety are formed when sa a variant of the numeral súã 
‘ten’ multiplies with the basic numerals ‘three’ to ‘nine’. The multiples of hundred are formed 
with the core numeral wiam ‘hundred’ followed by the numerals one to ten, i.e. n is followed 
by x as shown in Table 11 and Table 12. 


Table 11 — Bugun multiples thirty to ninety 


Bugun Gloss Bugun Gloss 
sa im Thirty sa milija Seventy 
sa wi Forty samilójá | Eighty 
sa kuó Fifty sa dige Ninety 
sa ràb Sixty 


Table 12 — Bugun multiples of hundred 


Bugun Gloss Bugun Gloss 

wiam dog One hundred wiam ràb Six hundred 
wiam jin Two hundred wiam milija_| Seven hundred 
wiam im Three hundred | wiam milajá | Eight hundred 
wiam wi Four hundred wiam dige Nine hundred 
wiam kua Five hundred wiam sud Thousand 


Native speakers use the word hazrai for thousand instead of wiam sud. The numeral
hazrai is derived from hazar ‘thousand’, a numeral found in most Indo-Aryan languages.


2.3.2. Multiplication in Deuri 


In Deuri the kuwa- morpheme multiplies with the base numerals 1 to 9 to give the even
multiples twenty, forty, sixty, eighty, hundred and so on, indicating a vigesimal process; see
Table 13.




Table 13 — Numeral kuwa- and even multiples 


Deuri Gloss
kuwa-ffa (20 x 1) Twenty
kuwa-kini (20 x 2) Forty
kuwa-yda (20 x 3) Sixty
kuwa-ffi (20 x 4) Eighty
kuwa-muwa (20 x 5) Hundred
kuwa-ffu (20 x 6) One hundred and twenty
kuwa-fiy (20 x 7) One hundred and forty
kuwa-ffe (20 x 8) One hundred and sixty
kuwa-dugu (20 x 9) One hundred and eighty


11. Numerals in Bugun, Deuri and Nocte 


But when it comes to the odd multiples thirty, fifty, seventy and ninety, Deuri
resorts to division. In Dzongkha, odd multiples are derived as shown in (1), taken
from Mazaudon (2010: 125).


(1) Dzongkha
khe pjhe-da pi: 20 ½ - 2 “half score to 2 (scores)”, 30
khe pjhe-da sum 20 ½ - 3 “half score to 3 (scores)”, 50


In Deuri the peripheral numerals three, five, seven and nine suffix to kuwa- to build the
odd multiples. In the derivation of the odd multiples a vowel, a consonant or a
syllable is dropped from the base numeral, as shown in Table 14. Unlike the Dzongkha
formation, Deuri odd multiples are derived by division. In Table 13, kuwa- multiplies with the
base numeral muwa ‘five’ to give kuwamuwa ‘hundred’. In Table 14, with the dropping of
wa, we have kuwamu ‘fifty’. The same is true for the multiple thirty: kuwaŋda is ‘sixty’, and
kuwada, with the velar nasal /ŋ/ dropped, is ‘thirty’. From Table 14 we see that the
phoneme or syllable within parentheses is obligatorily dropped to build the odd multiples. The
odd multiples (30, 50, 70 and 90) are thus a combination of 20 and the un-prefixed base
numeral; the base numeral is disyllabic, and one syllable is dropped, the first syllable in
some cases and the last in others. Why this happens is not known and needs
further investigation.


Table 14 — Prefix kuwa and odd multiples 


Deuri Gloss 
kuwa-(y)da Thirty 
kuwa-mu(wa) Fifty 
kuwa-fi(y) Seventy 
kuwa-(du) gu Ninety 


Unlike Deuri, Bodo, which also belongs to the Bodo-Koch subgroup (Burling 2003), uses
multiplication to build both the odd and the even multiples 20, 30, 40, 50, 60, 70, 80 and 90.
For example, 20 is formed by the multiplication of nai ‘two’ x di ‘ten’ = nati ‘twenty’; the odd
multiple 30 is formed by the multiplication of t'am ‘three’ x di ‘ten’ = tamdgi ‘thirty’.

Table 15 shows the multiples of hundred, where the core numerals multiply with
kuwamuwa ‘hundred’ to construct the higher multiples.


Table 15 — Multiples of hundred in Deuri 


Deuri Gloss 

kuwamuwa muhuni Two hundred 
kuwamuwa mumuwa Five hundred 
kuwamuwa muduga One thousand 
kuwamuwa kuwamuwa Ten thousand 


2.3.3. Multiplication in Nocte 
In Nocte ruwok- ‘ten’ multiplies with the basic numerals 2 to 9 to form the multiples. The
numerals 20 and 30 are formed when ruwok- multiplies with the base numerals 2 and 3, and
the multiples 40 to 90 are formed when the core cardinals 4 to 9 multiply with ruwok-.




North East Indian Linguistics 7 


Table 16 — Multiplication in Nocte 


Nocte Gloss Nocte Gloss 
ruwok-ni Twenty | ruwok-irok Sixty 
ruwok-ram Thirty ruwok-inid’ Seventy 
ruwok-beli Forty ruwok-sat Eighty 
ruwok- bana | Fifty ruwok-ik^i Ninety 


The multiples of hundred are formed in Nocte by the multiplication of t/a ‘hundred’ with
the basic numerals 1 to 9; see Table 17. For multiples of thousand, the base numeral tfi ‘ten’
multiplies Haie ‘one hundred’ to form tfi tare ‘thousand’.


Table 17 — Multiples of hundred in Nocte 


Nocte Gloss Nocte Gloss 

ffa-t'é One hundred ýa- irók Six hundred 
Ya-ni Two hundred | fa-iyit Seven hundred 
Ya-ram Three hundred | fa-iset Eight hundred 
ffa-belí Four hundred | ga-ik'i Nine hundred 
fa-bangnga | Fivehundred | gi ga-t'é | One thousand 


2.4. Multiplication and addition 


All the three languages use multiplication and addition to build up intermediate numbers 
between the multiples. 


2.4.1. Bugun multiplication and addition 


In §2.2.1, we have shown how Bugun takes the conjunctive na ‘and’ to build higher numerals.
The postpositional marker reé, which gives the reading ‘after that’, can occur in
place of the conjunctive na in the multiplication and addition process. In (2a-c) we have
instances of multiplication and addition building numerals like ‘thirty one’, ‘forty one’ and
‘one hundred and one’. In (2a) we find variation in the number building process, where
either na or reé occurs. In (2b) the conjunctive na operates as the additive marker, and in (2c)
and (2d) reé acts as the additive marker with the reading ‘after that’.


(2a) sa un na/ reé ió
ten three and/POSP one
Lit: ten three and one/ after that one
‘Thirty One’


(2b) sa Wi na dió
ten four and one
Lit: ten four and one
‘Forty One’


(2c) wiam ka ree aid
hundred one POSP one
Lit: hundred one after that one
‘One hundred one’


(2d) wiam sud rêe dio
hundred ten POSP one
Lit: hundred ten after that one
‘One thousand one’
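
The literal readings in (2a-d) follow one template: the base, its multiplier, a linker and the added numeral. A minimal sketch of that template is given below; it works with English glosses only, and the names bugun_compound_gloss, NUM and LINKERS are hypothetical. Which linker is chosen for a given numeral is left to the caller, since the examples show both possibilities.

```python
NUM = {1: "one", 2: "two", 3: "three", 4: "four", 5: "five",
       6: "six", 7: "seven", 8: "eight", 9: "nine", 10: "ten"}
LINKERS = ("and", "after that")   # glosses of the conjunctive and the postposition


def bugun_compound_gloss(value: int, linker: str = "and") -> str:
    """Literal gloss BASE MULTIPLIER LINKER UNIT, as in examples (2a-d)."""
    if linker not in LINKERS:
        raise ValueError("linker must be 'and' or 'after that'")
    if 31 <= value <= 99 and value % 10:
        tens, unit = divmod(value, 10)
        return f"ten {NUM[tens]} {linker} {NUM[unit]}"
    if 101 <= value <= 1009:
        hundreds, unit = divmod(value, 100)
        if 1 <= unit <= 9:
            return f"hundred {NUM[hundreds]} {linker} {NUM[unit]}"
    raise ValueError("value is outside the patterns shown in (2a-d)")


print(bugun_compound_gloss(31))
print(bugun_compound_gloss(1001, linker="after that"))
```

Note that the base precedes its multiplier in both the tens and the hundreds, so ‘one thousand one’ is analysed as hundred x ten plus one.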


2.4.2. Deuri multiplication and addition 


In Table 18 we have instances of the multiplication and addition of the even multiples in 
Deuri. 


Table 18 — Multiplication-addition of even multiples in Deuri 


Deuri Gloss 

kuwaffa muza (20 x 1 + 1) Twenty one
kuwakini muza (20 x 2 + 1) Forty one
kuwayda muza (20 x 3 + 1) Sixty one
kuwaffi muza (20 x 4 + 1) Eighty one
kuwamuwa muza (20 x 5 + 1) Hundred and one


Table 19 shows multiplication, division and addition operating simultaneously to build
numerals on the odd multiples in Deuri. As seen in §2.3.2 (Table 14), Deuri resorts to division
to build the odd multiples thirty, fifty, seventy and ninety; a basic numeral is then added to
the odd multiple.


Table 19 — Multiplication-division-addition of odd multiples 


Deuri Gloss 
kuwada muza (20 x 3 ÷ 2 + 1) Thirty one
kuwamuwa muza (20 x 5 ÷ 2 + 1) Fifty one
kuwafi muza (20 x 7 ÷ 2 + 1) Seventy one
kuwagu muza (20 x 9 ÷ 2 + 1) Ninety one
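
The arithmetic in Tables 18 and 19 can be checked with a small sketch that computes the value from its parts. This is an illustration under stated assumptions: the function name deuri_value and its parameters are hypothetical, and the order of operations (multiply, optionally halve, then add) is read off the bracketed formulas in the tables.

```python
def deuri_value(multiplier: int, halved: bool, unit: int = 0) -> int:
    """Compute 20 x multiplier, optionally divided by 2, plus unit."""
    if not 1 <= multiplier <= 9:
        raise ValueError("multiplier must be a base numeral from 1 to 9")
    if not 0 <= unit <= 9:
        raise ValueError("unit must be between 0 and 9")
    product = 20 * multiplier
    if halved:
        # 20 x multiplier is always even, so integer division is exact
        product //= 2
    return product + unit


# Table 18 type: 20 x 3 + 1
print(deuri_value(3, halved=False, unit=1))
# Table 19 type: 20 x 3 / 2 + 1
print(deuri_value(3, halved=True, unit=1))
```

The two calls differ only in the halving step, which is the single operation that separates ‘sixty one’ from ‘thirty one’ in this analysis.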


2.4.3. Multiplication and addition in Nocte 


In Nocte the cardinal prefix ruwok- multiplies with the base numerals 2 to 9, followed by the
addition of a basic numeral, to build up the numbers. Table 20 gives examples of how
numerals like twenty one, thirty one etc. are built in the language. For instance, twenty one is
formed when the peripheral ni ‘two’ multiplies with ruwok-, which is equivalent to ‘ten’, and
the basic numeral want^e ‘one’ is then added to form ruwokni want'e ‘twenty one’.


Table 20 — Multiplication and addition in Nocte 


Nocte Gloss 

ruwok- ni want'é Twenty one 
ruwok- ràm want'é Thirty one 
ruwok- beli want'é Forty one 

ruwok- band want'é Fifty one 

ffa-t"é want'é One hundred one 
fa- ni want'é Two hundred one 
ffi fa-t"é want'é One thousand one 
The number building process shows that the arithmetical structure nx+y is
consistently maintained in Nocte. Bugun, in addition and multiplication, takes either
the additive marker na or the postposition reé to build compound numerals. Deuri, which has a
vigesimal system, resorts to addition, multiplication and division to build compound
numerals.


3. Ordinals 
Ordinals are words like first, second, third etc. In this section we shall look into how ordinals 
are formed in Bugun, Deuri and Nocte. 


3.1. Ordinals in Bugun 
Ordinals in Bugun are limited to three. In Table 21 we see that the language has an ordinal
equivalent to the English ‘first’; what we have for ‘second’ and ‘third’ are periphrastic
expressions.


Table 21 — Ordinals in Bugun 


Bugun Gloss 
ibi First 
before 

ai p'a dë Second 
one DEF next 

dai p'a do pa Third 
one DEF next DEF 


Alternatively, Bugun uses words like egó ‘first’, tuoi ‘middle’ and edõ ‘last’ as
ordinals. In the following sentences (3-4), we have ego ‘first’ in (3a) and (4), and ibi in
(3b). Normally egó is used when there is time reference and ibi is used with reference to
position.


(3a) ap'ua-ye gerie táp p'a teacher ego
Father our village POSP teacher first
‘Father was the first teacher of our village.’


(3b) oe ka-ibi-fa rio 
S/he my-before-LOC stand 
‘She stood before me’. 
(Lit: She is first and I am second) 


(4) geé mufua ego do 
I-ERG tiger first see 
‘I saw the tiger first.’ 


With kinship words, the language uses the following ordinals: kau, gun and nu. Some
Tibeto-Burman languages, for example Singpho, have words for the 1st son, 2nd son,
3rd son, 1st daughter and 2nd daughter; we do not see this pattern in Bugun. The Buguns do
not name their children according to birth order. The kinship words iau, fun, nu etc. are used
to address an individual according to his or her status within the family.
Table 22 — Ordinals with kinship words


Bugun | Gloss | Bugun | Gloss
abau Han | Brother first | p'unu Han | Uncle first
abau fug | Brother second | p'unu fug | Uncle second
abau nu | Brother third / last | p"unu nu | Uncle third / last


To count days, the Buguns use time adverbials, as shown in Table 23 below:


Table 23 — Days in Bugun


Bugun Gloss 

dijao Yesterday 

sodou Today 

timijan Tomorrow 

tado Day before yesterday 
bodo Day after tomorrow 


Iterative ordinals in Bugun are formed when the iterative marker Ion precedes the basic 
numerals as in Table 24. 


Table 24 — Iterative ordinals in Bugun 


Bugun Gloss Bugun Gloss 

lon ka Once log wi Four times 
loy nig Twice lo kus Five times 
lon im Thrice loy rab Six times 


3.2. Ordinals in Deuri 


Deuri uses the particle si to form ordinals. The particle si follows the cardinal
numeral to give the ordinal reading. The particle si also has other functions. According to
Kishore Deuri (PC), si can be a feminine gender marker. For example, the adjective tamu
means ‘deaf’; when the particle si attaches to tamu to form tamusi, it means ‘deaf girl / woman’.
Similarly, the noun gira means ‘old man’ and girasi means ‘old woman’. The particle si
also functions as an emphatic marker in constructions such as (5) below:


(5) ba si noni
He EMPH doing
‘It is he who is doing.’


In Table 25 we have the ordinals of Deuri. 


Table 25 — Ordinals in Deuri 


Deuri Gloss Deuri Gloss 
muza si First muffu si Sixth 
muhuni si Second mufiy si Seventh 
muyda si Third muffe si Eighth 
mudffi si Fourth mudugu si Ninth 
mumuwa si | Fifth muduga si Tenth 


Iterative ordinals in Deuri are formed when the prefix *ma- affixes to the peripheral 
numerals. 
Table 26 — Iterative ordinals in Deuri


Deuri Gloss Deuri Gloss 

m H Once ma-ffu 6 times 
ma-kini Twice ma-fin 7 times 
ma-yda Thrice ma-fe 8 times 
ma-ffi 4 times ma-dugu 9 times 
ma-muwa 5 times ma-duga 10 times 


As in Bugun, Deuri too has ordinals that occur with kinship terms as shown in Table 27. 
Ordinals with kinship terms are derived with the particle si. 


Table 27 — Deuri Ordinals with kinship terms 


Deuri Gloss 

dema si pisa Elder son 
sosiba si pisa Middle son 
suruba si pisa Youngest son 


3.3. Ordinals in Nocte 


Nocte has one ordinal, bang ko ‘first’ (6a). In (6b) and (7) we have examples of how
periphrastic ordinals are formed in Nocte.


(6a) bdy-ko
first-LOC
‘first’


(6b) oti k*adi-ko 
he after-LOC 
‘after him’ 


(7)  bay-ko ni wa-hi ako hok-t"à 
first-LOC we father-PLU here arrive-pst-3 
‘Our fathers came here first’.


Some ordered Nocte kinship terms are shown in Table 28.


Table 28 — Ordered kinship words in Nocte 


Nocte Gloss 

a-doy Elder brother 

adoy Father’s sister (both younger and older) 
a-ffe Second elder brother 

a-k^u Elder sister / sister-in-law 

pak'u pe Elder daughter 

mor pe Middle daughter 

nadi Youngest brother/sister 


Nocte speakers tend to drop the prefix a- of the ordinals ador, afe and ak"u. Nocte has an
adjective doy meaning ‘big’. The ordinal a-doy refers to the elder brother or big brother
(usually the eldest one). The ordinal ador also refers to father's sister (both elder and
younger). The examples below highlight the difference between these words.


(8) him doy 
house big 
‘big house’ 


(9) adoy ka-i-r-a affé kà-r-á-ma 
eldest brother come-CONT-FUT-3SG second brother come-FUT-3SG-NEG 
‘My eldest brother will be coming but the second elder brother will not come’. 


(10) adóg ` p'imu kā-t*a 
aunt phimu ` COME-PERF-3SG 
‘Aunt Phimu has come’.


Iterative Ordinals in Nocte are formed when the ordinal marker œan- prefixes to the 
numerals. 


Table 29 — Iterative ordinals in Nocte 


Nocte Gloss Nocte Gloss 
dan-thé Once dgan-irók 6 times 
d;an-ní Twice d;an-igid" 7 times 
dan-ram Thrice d;an-isat 8 times 
a&an-beli 4 times d;an-ik'i 9 times 
dan and 5 times dgan-iff'i 10 times 


4. Fractions 


Fractions help us to understand how a language allows a numeral to be divided into smaller 
parts. 


4.1. Fractions in Bugun 


Table 30 — Fractions in Bugun 


Bugun Gloss 

Kio                        Half

Kio wf pha aio             One fourth
half four POSP one

k^io kud p'a aio           One fifth
half five POSP one

k'io wil p'a im            Three fourth
half four POSP three

dio rée io                 One and a half
one POSP half

sud na digé rée Kio        Nineteen and a half
ten and nine POSP half


The Bugun word for ‘half’ is kʰiyo. To indicate smaller fractions, kʰiyo ‘half’ is followed by two
basic numerals, and the postposition p’a occurs between the basic numerals, as shown in Table 30. In
the case of higher fractions like one and a half, the basic numeral is followed by kʰiyo, and the
postposition rée occurs between the basic numeral and kʰiyo. The postposition rée gives the
reading of ‘after that’. The Bugun word kʰiyo ‘half’ and the smaller fractions are used by
native speakers regularly, although the younger generations prefer to use the Hindi and
English equivalents. This reflects the impact of education and of the dominant languages: Hindi,
English, Nepali and others.


4.2. Fractions in Deuri 

In Deuri, to form the fraction ‘half’ the peripheral numeral ya ‘one’ is preceded by dok. In the case
of one fourth and three fourth, the base numerals combine to show the fractions. In the case of
higher numerals, the genitive case marker jo is used.


Table 31 — Fractions in Deuri 


Deuri                              Gloss

dok go                             Half
half one

gu ya                              One fourth
four one

gu fa nda                          Three fourth (3/4)

mufi jo mufu fa                    Six eighth (6/8)
eight GEN six one

muduga muhuni jo mudugu ýa         Nine twelfth (9/12)
ten two GEN nine one

kuwa ýa -jo muduga muwa fa         Fifteen twentieth (15/20)
twenty one GEN ten five one


4.3. Fractions in Nocte 


In Nocte, fractions are constructed with k’a ‘half’, which precedes either a core numeral or a
basic numeral, and the postposition marker wa ‘from’. The postposition wa is placed
between the numerals. The word k’a ‘half’ is commonly used in everyday speech. But
fractions like k’a beli wa k’a thé ‘half four from half one’ and the others cited in Table 32 are
used only as and when the situation arises, and not as frequently as ‘half’.


Table 32 — Fractions in Nocte 


Nocte Gloss 

kha Half 

k'a beli wa kta thé One fourth 

half four from half one 

k'a band wa k'a t"é One fifth 

half five from half one 

kta belí wa k'a rom Three fourth 

half four from half three 

pa- té k'a thé One and a half year 
year one half one 


5. Conclusion 

From our study of the numeral system in Bugun, Deuri and Nocte we find that the cardinal
numerals of these languages determine whether these languages have a decimal system or a
vigesimal system. Deuri and Nocte, both belonging to the Bodo-Konyak-Jingpho group, have
two different numeral systems. Deuri has a vigesimal system and Nocte a decimal system.
The cardinal system in Bugun is a decimal system like that of Nocte; however, the two languages
differ in one respect: the basic numerals in Bugun are free forms whereas those of Nocte are
derived with the prefixes wan-, ba- and i-. In this respect Nocte and Deuri are similar: in Deuri
too, basic numerals are derived with the prefix mu-. In Table 33, we have highlighted the
similarities and differences in the numeral systems of these languages.


Table 33 — Similarities / differences in Bugun, Deuri and Nocte 


Features Bugun Deuri Nocte 
Decimal system + - +
Vigesimal system - + - 
Prefix - 
Addition + 
Multiplication 
Division - + - 
Ordinals 
Iterative ordinals 
Kinship ordinals 
Fractions 
Loan words 

Abbreviations 

3SG 3rd person singular

ADD Additive 

CONT Continuous 

DEF Definite 

ERG Ergative 

FUT Future time 

LOC Locative 

NEG Negative 

PERF Perfect 

PLU Plural 

POSP Postposition


12. The counting unit system of Pnar, War, Khasi, Lyngam and its traces in
Austroasiatic composite cardinal systems¹


Anne Daladier 
LACITO-CNRS 


Abstract Different attempts at reconstructing Austroasiatic (AA) numbers have proved unsuccessful although large 
data sets are available in all the different AA groups. AA cardinals are composite systems. As already 
guessed by Jenner (1976), the explanation lies in the fact that another number notion existed before AA 
use of cardinals. Through a large collection of data sets, collected first hand, I show evidence of this 
former AA number notion. I analyze it as ‘counting unit’ (CU) after the notion of "grouping" of 
Menninger (1934). Such a CU system is still well alive in Pnar, War, Khasi and Lyngam, a group of 
languages I define as Pnaric-War-Lyngam (PWL). This CU notion involves numeration bases depending 
on what is counted. It also involves different groupings of units depending on how goods of the same kind 
are grouped into specific containers for trades. This CU number notion also involves abstract entities like 
humans as clan beings or space and time units. 

Seven words for PWL CU's are widespread in AA numbers which came to be used as cardinals. One of 
the main results of this study is that their transformation into cardinals depends on their numeration 
bases as CU's. Vestiges of different numeration bases in AA cardinals which are typical of PWL CU's are 
still found in many different AA groups. 

*kur, a vigesimal CU in PWL, widespread in Munda languages as ‘twenty’, has also been borrowed as
‘twenty’ in Sanskrit, in North-Eastern Dravidian languages and, as advocated here, probably also in
PTB. Matisoff (2003:416) reconstructs *m-kul ‘all, twenty’ in PTB. I relate this TB root to a core AA root 
*kur ‘clan, human being, human “nationality” as clan generated group, multitude (as a benefit of clan 
generation), twenty (as the total of the fingers of the hands and feet of a human being)’. I analyse the 
prefix m- in AA cardinals as a reduction from AA man ‘one’. 

Because of its CU system still in use, PWL has rather innovative cardinals. PWL has two cardinal 
systems, a Pnaric one and a War one. PWL cardinals confirm independent historical and typological 
data: Khasi appears to be an offshoot of Pnar while War and Lyngam are not Pnaric. PWL is very 
quickly ‘khasifying’, on a post-colonial refuge territory. 

PWL CU system and CU words together with conservative AA grammatical features in “core” War, Pnar 
and Lyngam varieties suggest an early settlement of the ancestor of PWL in the lower Brahmaputra 
valley. 


Historical change of number notions is an interesting example of the deep relativity of iconicity and thus
of the relativity of the very notion of cognate, on a long-term scale. However this technical question on
historical reconstruction raises interesting issues about conceptual calques and language internal
derivations.


1. Introduction


As already noticed both by Zide (1976, 1978) and, for different reasons, by Jenner (1976), the
reconstruction of an Austroasiatic cardinal system of numerals could not be achieved. For
Zide (1978), only mere “educated guesses” could be offered, though he conducted the most
important first-hand documentation on cardinals in all the different Munda languages. His
guesses on what might be historically and phonologically related as AA cardinal cognates show
discouraging discrepancies, as he himself admits. Zide (1976:5) incidentally notes that traces
of different numeration bases, especially ‘two’, ‘four’, ‘five’ and ‘twenty’, are found from
East to West in most AA languages. He does not relate this finding to the existence of a
notion of number involving several numeration bases, a grouping way to count according to
what is counted, used before the number notion of cardinal. In a very illuminating way Jenner
(1976:51), after the deep intuitions of Coedès (1942) and thanks to Old Khmer inscriptions,
opened the way for such a solution, showing that a former “collective” notion of number
using different numeration bases can be sketched in early Khmer inscriptions.


1 Deepest thanks to Rofinus Jat, Nickey Nongsiang, Woh Monti Pohtam Chernia and Lakhmie Pohtam Sohsley
for their information on respectively Pnar, Lyngam and War cardinals and counting units. All the data on these
three different languages were collected firsthand and recorded in the context of village life. Woh Monti
introduced me to divination counting techniques, to the making of different kinds of measure-baskets and to
different kinds of bindings of goods together with their counting units. I am also grateful to Linda Konnerth. I take
responsibility for the viewpoints and for a few necessary arithmetical technicalities.

In the light of these two preliminary negative and positive findings, I will show here that 
the notion of cardinal number can be used only very indirectly to trace back Austroasiatic 
(AA) number systems and their actual isoglosses, and that a former notion of number still 
alive in PWL enables us to understand indirectly the main features of AA cardinal systems 
and present AA number words. 

In §2, I will summarize how Menninger (1969) shows that “grouping”, a number notion
sensitive to what is counted and as such involving different numeration bases, precedes that of
cardinal. Such “groupings” are so firmly grounded in easy ways to perceive quantities of
abstract entities that they still coexist with cardinals even in modern European languages,
especially for time and space notions. Some space units still use body parts in English, like
inches and feet. We still do not quantify time in decimal cardinals. Small units are counted in
base sixty: sixty seconds = one minute, sixty minutes = one hour. Then cyclic units of 24
hours are grouped into a day, cyclic units of 7 days into weeks, etc., using different numeration
bases for different time unit cycles. Finally, when we perceive the precise time value of one
year, we do not perceive it in terms of seconds, but in terms of months, and we have notions
like “seasons” which enable us to compare yearly processes. In the same way we construct
representations to compare processes which may involve very large and very small time
scales. I will further specify this “grouping” notion in terms of “counting units” (CU’s), a
number notion fit for combinations of different units with different numeration bases and
meant for easy perception or calculations on large quantities of goods, space or time, without
writing devices.
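
The time example can be made concrete with a short sketch. It is only an illustration of "units of units" with a different numeration base at each level (60, 60, 24, 7, as in the paragraph above); the names `TIME_UNITS`, `group_seconds` and `to_seconds` are assumptions introduced here and do not come from Menninger or from the PWL data.

```python
# Illustrative sketch: each entry is (unit name, how many of this unit
# make up one unit of the next level). The bases are those of the text.
TIME_UNITS = [("second", 60), ("minute", 60), ("hour", 24), ("day", 7)]
TOP_UNIT = "week"


def group_seconds(total_seconds):
    """Regroup a flat count of seconds into weeks, days, hours, minutes, seconds."""
    remaining = total_seconds
    grouped = {}
    for name, base in TIME_UNITS:
        # The quotient moves up to the next unit, the remainder stays at this level.
        remaining, grouped[name] = divmod(remaining, base)
    grouped[TOP_UNIT] = remaining
    return grouped


def to_seconds(grouped):
    """Inverse operation: evaluate a grouped quantity as a flat count of seconds."""
    total = grouped.get(TOP_UNIT, 0)
    for name, base in reversed(TIME_UNITS):
        total = total * base + grouped.get(name, 0)
    return total


if __name__ == "__main__":
    print(group_seconds(90061))
```

Adding two durations expressed in days and hours never requires the flat count returned by `to_seconds`; this is the sense in which grouped quantities can be handled without being fully evaluated.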

There is no published data on “groupings” in AA but, as shown in §3, a few “groupings”
are found in Old Khmer inscriptions, as well as in Santali and in Mundari; a few genuine
groupings are described in Khmer and in Munda by Coedès (1942), Jenner (1976), Maspero
(1915), Bodding (1932-7) and Hoffmann (1998). I will summarize the main findings,
especially those of Jenner and Przyluski, showing remnant traces of numeration in bases ‘four’,
‘five’ and ‘twenty’ typical of the PWL CU system in different AA cardinal systems. I will
also describe *kur and *pon PWL CU words found in Sanskrit, also *m-kur ‘many, twenty’ in
PTB probably borrowed from AA *kur ‘clan, prosperity, plenty, twenty’, with m derived from
PWL CU *mon. *mon is widely used as ‘one’ in AA cardinals both in multiplicative pre-posed
position as in English ‘one hundred’ and in the additive post-posed position of ‘twenty
one’. The importance of additive and multiplicative positions in numbers is described in §2
and §4.

In §4, I analyze in detail the CU system of PWL with its numeration bases and its
arithmetical compositional rules.

In §5, I analyze in detail the two cardinal sets in PWL: Pnaric and War.

In §6, I analyze the composite features of PWL cardinal words. Some cardinal words
appear to be reduced forms of additive or subtractive expressions on other cardinal words.
This feature appears in other AA languages. Old Khmer cardinals are analyzed as an already
derived composite system by Jenner (1976). His main results are summarized.

In §7, I take up the existing comparative analyses of AA cardinal words and connect them
to PWL CU words and in a few cases to rather recent Sinitic borrowings, possibly via Tai or
Thai, and one TB borrowing.

In §8, I show how AA cardinals result from composite number systems. AA cardinal
words ‘one’, ‘two’, ‘four’, ‘five’, ‘ten’, ‘twenty’ are often phonologically derived from PWL 
CU words according to their numeration base. Cardinal words ‘seven’, ‘eight’, ‘nine’, which 
do not correspond to numeration bases in the PWL CU system, are disparate, sometimes 
borrowed, sometimes showing traces, as in PWL, of reduced additive or subtractive 
expressions on ‘five’ or on ‘ten’. 

In §9, I summarize a few historical and typological data on Khasi, Pnar, War and Lyngam 
and analyze in this context the common PWL CU system and the two Pnaric and War sets of 
cardinals in PWL. I show how number words confirm the label Pnaric-War-Lyngam defined 
in Daladier (2011) which replaces Khasian or Khasic scientistic “classifications”. 

As the most common AA cardinal words in AA and all the vestiges of numeration bases 
are related to PWL CU words and CU numeration bases, I conclude that the PWL CU system
is a pretty conservative AA numeral system. 

As unfortunately there is, and there will be, no available complete data sets of CU’s in AA 
(mainly oral) languages, except for PWL, the proof of an AA notion of CU’s providing the 
main source for AA cardinals is organized along the above mentioned eight sections as eight 
lines of argumentation converging to the conclusion. Each line is detailed with as many data 
as available, in order to convince the reader to follow this method up to the conclusion. 
Having no conventional means of historical reconstruction on this large historical scale, I 
solicit the patience of the reader before he reaches the conclusion. Another important point 
which cannot be addressed here is the cultural and religious meaning of the PWL counting 
notion. CU’s are important in everyday life and commercial activities but “counting” in its 
lexical PWL sense, is central in divination and ritual performances. As strange as it may seem 
for linguists, counting together with its divination techniques is central as a way to classify 
things and beings and as a way to think and to perceive the world in the War religion². It is a
way to structure the perception of the world. It might help the reader to know that this kind of 
subjectivity is not alien to scientific methods. From an abstract mathematical viewpoint, in 
modern (constructivist) mathematics, numbers are not primitive values but on the contrary are 
defined as resulting from different kinds of numeration functions. A work on common 
features of an AA religion with detailed explanations on their abstract ontologies and their 
poetics is in preparation. 


2. Different notions of number including cardinal and grouping (or CU) 


In his history of numbers in the main cultures, Ifrah (1998) traces back how the current notion
of cardinal number, the Hindu-Arabic decimal cardinal notion, is based on a positional zero
(i.e. according to its position, “zero” assumes different values: zero, ten, hundred, thousand
etc.). For Ifrah, the discovery of cardinals might follow the use of Brahmi decimal numbers.
Cardinal, a written number notion with its positional zero, allows easy calculation techniques,
especially on large numbers, and does not depend on what is counted. According to Ifrah, it
probably originates in Jain Buddhism, in the sixth century AD with the Lokavibhaga text; the
very word ‘zero’ is derived from Sanskrit sunya ‘emptiness’. The Persian mathematician
al-Khwarizmi synthesized Hindu and Greek knowledge on arithmetic and transmitted it in 825
AD in his book on calculation to Arabic mathematicians.


2 This view is not as unconventional as it may seem. In this respect the etymology of ‘number’ from Latin
numerus is quite interesting. According to Ernout and Meillet (1932), numerus means to be in a certain
category, to classify, to belong to a certain religious or social rank before it means ‘to enumerate’. It also means
rhythm, measure and numbers as magic lucky / unlucky devices. These meanings are widely associated with
number notions in many cultures, see §2.

From an ethno-linguistic viewpoint, the values of this positional “zero” might be
compared with the different functional values of a word according to its position in a
sentence. Standard word order varies in languages and varies in diachrony but in a given
language, a lexical element may be interpreted as subject, object, nominal, verbal or topic
according to its position. The discovery of a positional zero takes place in a culture much
interested in precise grammatical analyses and terminologies. About four centuries before the
Lokavibhaga text, Patanjali takes up and develops all the grammatical work of Panini, fixing
classical Sanskrit, see Cardona (1997). I come back with precise examples on the interaction
of numbers and grammatical uses about two different numbers “one” in PWL in §4.

Different kinds of decimal numbers without a positional zero also appeared in different
cultures. For instance in India, Brahmi numerals around the third century BC are a decimal
positional system. In China, the late Shang Dynasty, around the 14th century BC, already had
a decimal system, but these numerals do not use a zero, and they later involved counting
boards for large calculations. These numerals were found on divination tortoise shells.
Vandermeersch (2013) gives a fascinating analysis of the relationship between counting and
divining in Chinese around the 8th century BC and offers the hypothesis that ideography came
into being as a divination means at that point.

In Meghalaya (NE India) counting techniques are still central in the Ancestor religion, in 
its fertility rituals and in its different divination techniques, especially in my still unpublished 
collection of Nongbareh War rituals. 

Linguists should be aware of ethnocentric generalisations on cognates. The notion of 
cardinal number is not a universal cognitive notion but an arithmetical, cultural and historical 
one. It appears late, in written cultures, as opposed to the act of counting and grouping similar 
objects into combinable units. This is very well shown and documented by Menninger (1934). 
He distinguishes two main notions of number which are related to two kinds of cognitive 
representation of numbers: a) a notion of "grouping" which refers to a class of objects and 
depends on what is counted; b) a cardinal notion, free from what is counted and usually in 
decimal base. Menninger shows how in different cultural ways, the cardinal notion is 
preceded by the grouping one, especially in oral cultures. Menninger shows especially how 
grouping number words may trace back internal evolutions into cardinal words. I will show 
how PWL CU's are related to many AA cardinals according to their numeration bases. Some 
AA cardinals alien to AA numeration bases are borrowed from neighboring languages and 
some others are reduced from additive or subtractive expressions on current quinary or 
decimal CU's. There is no AA word for ‘zero’ and the IA word sup is used in PWL as in 
Munda. 

Counting units are meant to combine into higher units. In a concrete way, a PWL CU
usually corresponds to specific packages for carrying, selling, or retailing goods, often with
specific baskets or specific packagings and bundles. More abstractly, counting units are
combinable classes of classes (or combinable units of units) with specific enumeration bases.
Specific classes of objects have specific combination rules to produce higher units, using
different numeration bases, described in §4. A CU is an arithmetic category with its
arithmetical combination rules and enumeration bases labelled with proper unit words. In
turn, the choice of CU categories with their combination rules and numeration bases depends
on cultural classifications of enumerable objects. For example in War, counting betel nuts and
oranges may be done using the same units with their combination rules, because the betel nut and
orange trade is the main business of the Wars and the price per weight, though not equal,
is on a similar scale. When trading for retail on market places, similar big bamboo baskets are
used.

Counting units are sensitive to what is counted and then to cultural classifications but they 
are not number classifiers. 

The arithmetic of combination rules of the different CU's into higher ones with different
numeration bases, according to counted objects, has a deep reason, not a linguistic one but an
arithmetical and a cultural one. Operations on large numbers of a definite good are performed
easily without written techniques because they are not fully evaluated; they are directly
visualized or perceived as quantities by the representation of their traditional containers, see
§5. The same is true for time units in English, like ‘month’ or ‘year’: when we add years, we do
not count or perceive the number of hours they contain. This is the whole point of having
units of units.


3. Cardinal numbers with their remnant traces of CU with numeration in bases 4, 5, 20 
in AA and area borrowings 


Jenner (1976) makes very interesting remarks on the fact that already Old Khmer numerals
sometimes refer to units of specific goods and sometimes to cardinals. For example trabaak
‘package of 20 paan leaves’ and slik ‘package of 400 paan leaves’ are groupings but they are
also used as cardinal numbers 20 and 400 respectively. Maspero (1915:304-6) describes other
CU's in Khmer in bases ‘twenty’, ‘sixteen’, ‘ten’ and ‘four’, some of them preserved from Old
Khmer. All these numeration bases are still used in PWL CU's, see §5.

The findings of Jenner shed considerable light on PWL secondary uses of some CU's as
cardinals. I will try to show that some apparent discrepancies in the values of these numerals
are in fact due to some period of time where meanings fade, numeration bases change and
finally disappear with the cultural domination of cardinals. For example, PWL has a
quaternary CU *hali: ‘set of four inanimate objects’ which is an interesting corresponding
form, less two powers of ten, of the Old Khmer number slik ‘400’ of paan leaves. Both number
words are derived from the name of the leaf in AA. Old Khmer has slik ‘leaf’, War sli ‘leaf’,
Pnaric sla:. In War, hali: is a quaternary CU of betel nuts, citrus or eggs, while the meaning of
the CU hali in Pnaric has faded and it is now a quaternary CU of any inanimate thing. The PWL
unit *hali: was most probably a quaternary CU for paan leaves, as hali:/hala: ‘leaf’ is still
found in many AA languages in addition to sla:/sli: ‘leaf’. All MK and Munda groups have at
least one of them (Shorto 2006:230). It seems interesting to derive further the AA root for
‘leaf’ with two possible minor syllables: [sV] and [KV/khV/hV/?V/ Ph] la? (see Daladier 2011).
Shorto traces back ‘leaf’ in: Aslian, Jahai hali?, Jah-Hut hla?, Kensiw hali?; in Bahnaric, 
Bahnar bio: Cua hla:, Nhaheun hla, Sedang hid, Tampuan A/a:; in Kane, Katu Pala: Nge, 
Phla:, Pacoh Pula:; in Khmuic, Khmu A/a?; in Monic, Nyakur A/áa?, Mon hla?; in Pearic, 
Kasong khla:; in Munda, Juang olag, Sora ?o:la:. Finally, slik ‘400 (of paan leaves)’ in Old 
Khmer is probably a cognate of a MK quaternary CU for paan leaves before being used as the 
cardinal ‘400’. Interestingly, in PWL both sesquisyllabic forms: [AV] li and [sV] lai: ‘leaf? 
coexist while [AV] /i has specialized and frozen as a quaternary unit no longer perceived as 
being related to AA words meaning ‘leaf’. 

I show in §5 how quaternary, quinary, decimal and vigesimal numeration bases may be
used alone in simple CU's or may combine in higher CU's in PWL. Traces of quinary and
vigesimal bases are found in cardinal numbers of most Munda languages, see Zide (1978), and
in many MK languages; Jenner (1976) describes vestiges of a quinary base in Old Khmer
with an AA number word fi/ta ‘hand’ (5 fingers). Old Mon and Aslian also have a vestigial
quinary base in their decimal cardinal number words with another AA number word *soy
‘five’ also found as a quinary PWL CU, as shown in §4.2 below.

One of the most interesting AA CU's is the vigesimal *kur; it has been widely borrowed in
Sanskrit, Indo-Aryan and Dravidian languages either as a grouping ‘set of twenty’ or as
cardinal ‘twenty’ and also as a kind of quantifier ‘plenty, many’. It also has a secondary use
as cardinal ‘twenty’ in South Munda languages where cardinals have traces of numeration in
base twenty. Finally it has a renewed use as cardinal ‘ten’ or teens in Pnaric, and in some
Palaungic, Khmuic and Bahnaric languages.

Turner (1999: 3503), following Przyluski (1929), analyses IA ko:di ‘score, twenty’ as an AA
borrowing, as it cannot be derived from Sanskrit. Assamese, Nepali, Bengali, Oriya, Gujarati,
Marathi and Kashmiri use kuri/kudi ‘score, set of twenty’. The meaning ‘score’ of IA kodi/
kur from AA CU *kur ‘set of twenty or vigesimal set of sets’ (i.e. set of sets counted by
twenties) contrasts with the cardinal words for ‘twenty’ derived in IA from Sanskrit vingati.
Przyluski (1929:26) states that Bengali uses both bis < vincati and kuri or kodi for ‘twenty’.

Derived forms of kur (kuri, kore, kodi, kade ‘twenty’) and vestiges of its use in a vigesimal
base indeed appear all over South Munda languages: Juang, Remo, Sora, Gta?, Korwa,
Gorum, Gadaba and Kharia, as shown by Zide (1978:53, 63, 55) and by further dialectal data
provided by Gosh on Gta?, Rau on Gorum, Donegan and Stampe on Sora³. kad/kade cardinal
‘twenty’ in Gorum and kode ‘twenty’ in Gadaba are used as vestiges of a vigesimal number
system as shown by Zide (1978:61), see for example in Gorum: mi-kad (lit. one-twenty)
‘twenty’, bar-kad (lit. two-twenty) ‘forty’ etc.
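
The multiplicative reading of the two Gorum forms just cited can be sketched as follows. This is only an illustration, assuming that each form is a pre-posed multiplier followed by the base kad ‘twenty’; the dictionaries contain only the glosses given above (mi ‘one’, bar ‘two’, kad ‘twenty’), and the function name `value_of` is hypothetical.

```python
# Glosses taken from the two Gorum examples in the text; nothing else is assumed
# about the rest of the Gorum numeral system.
BASE_VALUE = {"kad": 20}
MULTIPLIER_VALUE = {"mi": 1, "bar": 2}


def value_of(form):
    """Evaluate a 'multiplier-base' numeral such as 'mi-kad' as multiplier x base."""
    multiplier, base = form.split("-")
    return MULTIPLIER_VALUE[multiplier] * BASE_VALUE[base]


if __name__ == "__main__":
    for form in ("mi-kad", "bar-kad"):
        print(form, value_of(form))
```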

The AA cognate *kur ‘twenty’ is not Dravidian: it does not appear in the Dravidian
Etymological Dictionary of Burrow and Emeneau (1961), but it is borrowed in Northern
Dravidian languages now in contact with AA, Bodish and IA: ‘twenty’ Kurukh (Palamau)
kuri, Malto kor, and in Western Dravidian languages in contact with Munda groups: Kuvi
and Kui kode, Pengo kudi, see Grierson (1967:646-649). For Tibeto-Burman, in a related and
most significant way, Przyluski (1929:28) states that in Upper Burma, the TB Siyins, also
referred to as Sizang, North-eastern Kuki-Chin, who still have a TB number system, have
borrowed kul ‘score’ from AA *kur CU in base twenty.

Matisoff (1997, 2003) and Benedict (1972), taken up by Bradley (2005: Table 2),
reconstruct *m-kul ‘twenty’ in PTB. In my opinion m- is most interesting here as it most
probably confirms Przyluski’s hypothesis of a borrowing from AA vigesimal *kur (as early
as in PTB). The frozen prefix m- is widely used in different AA cardinals (especially for
‘five’ and ‘twenty’) using AA CU words as a reduced form from AA mi/ muj ‘one’ in AA
cardinal names using CU words, see §8, as in Gorum mi-kad (lit. one-twenty) ‘twenty’ or
as in m-sun (one-five) ‘five’ in Old Mon⁴, me-say ‘one-five’ in Semlai (Aslian), as a remnant of
numeration in base ‘twenty’ or in base five, see §7.

Further, the hypothesis of an inverse borrowing in AA from TB is most unlikely, not only
because kur ‘twenty’ is found in most AA languages, but mainly because the meaning
‘twenty’ of kur is related to a former meaning found in all AA groups. The value ‘clan, clan
descent and by extension multitude, also human being as a being belonging to a clan’
accounts for the associate meaning of *kur ‘twenty’ given by Przyluski (1929): it is a property
of humans to have twenty fingers (hands and feet). It is very unlikely that the main clanic
meaning of *kur in AA might be a PTB borrowing. It is found on former Pnar territories in the
Karbi Anglong (Karl-Heinz Grüßner p.c.; see maps in Daladier 2014). In addition, a last
argument for a borrowing from AA to PTB m-kul is the fact that an etymon kur, kul etc.
‘twenty’ does not seem to exist in Sinitic or Chinese languages according to current
dictionaries, and according to the ongoing reconstruction of Sagart and Baxter.


3 These unpublished data are available at http://lingweb.eva.mpg.de/numeral/; their authors are reliable sources.
4 m- in AA cardinals is a frozen prefix derived from AA mi/ muj ‘one’ as a multiplicative positional element and
cannot be confused with a sesquisyllable.

Quite interestingly, Matisoff (2003:416) shows the distribution of *m-kul in TB languages
of N-E India: Garo khol, khal, Meithei kul, Karbi ig-kol, *m-kul in dozens of Kuki-Chin
languages, Angami meku, Ao Mongsen mukyi.

In Indo-Aryan, Turner (1999: 3498) also takes up from Przyluski (1929) the derivation of
IA ko:ṭi ‘ten million’ from AA kur, and derives by hyper-sanskritisation the
widespread IA element kror ‘crore, multitude’ from this AA element *kur, which usually
denotes in AA a set of twenty but also a notion of multitude. As shown in §5, PWL *kur is
quite often used as a unit of sub-units and may denote a big number. Matisoff (2003) adds the
meaning ‘all’ to ‘twenty’ for m-kul, another quite relevant similarity with AA kur. So PTB m-kul
‘twenty, many’ seems to have been borrowed from the AA vigesimal CU *kur.

All these data suggest an early AA use of *kur as a vigesimal CU associated with its
original meaning of ‘clan, descent group, human being (a being whose life is defined by
clanic organisation)’ because, as noted by Przyluski, a human being has two hands and two
feet, hence twenty fingers. Pnar, Khasi, War and Lyngam have kur ‘clan, maternal lineage’,
Munda hor, horo, koro ‘person’, and Shorto (2006:1708) mentions Old Mon kirku:l ‘clan,
family’, Bahnar khu:l ‘descent group’. As indicated by Guilleminet (1959:401), Bahnar also
has khul ‘to group, to classify beings or things’.

The use of *kur as a unit of goods counted in a vigesimal base has been followed by its use
as cardinal ‘twenty’ in a rather large area of Munda, TB, North Dravidian and IA groups in
contact through trade.

Though it might look surprising at first glance, it is probably the same AA word *kur
‘clan, human being, multitude, vigesimal CU, twenty, crore’ which has later been used as
cardinal ‘ten’ in a few northern AA languages. It may be less surprising when we see the uses
of the AA word for ‘hand’ used both for ‘five’ and for ‘ten’ in AA, see AA cardinals below. I
also suppose so because the very same set of derived forms is observed in both cases: kor, kul,
kode, kad, gal etc. for ‘ten’ and for ‘twenty’. According to the word lists of Luce (1985), we have:
kor ‘ten’ in Palaung, sahar ‘ten’ and kol ‘tens’ in Riang-Sak, kol ‘ten’ in Wa (Palaungic), gul
‘ten’ in Mlabri (Khmuic), kul ‘ten’ in Cua (Bahnaric), and Zide (1978) gives Santali gal ‘ten’.
Pnar and Khasi, which still use kuri as a vigesimal CU, use the frozen Har. ‘teen’ in their
cardinals. Very interestingly, what might be an innovation for ‘ten’ or ‘teens’ is shared by a
few North Munda, Bahnaric, Khmuic, Palaungic and Pnaric languages. Interestingly War,
South Munda languages, Mon and Aslian do not share this innovation.

So, as I guess, *kur, *kad became either ‘ten’ or ‘teens’ in some North Munda, PWL, 
Palaungic, Khmuic and Bahnaric cardinal systems, perhaps through a historical chain of 
micro-contacts. 

The intricate history of *kur somewhat repeats itself with the PWL quaternary CU pon. As 
noted by Przyluski (1929:xiv-xvi), AA pon was borrowed into Sanskrit as pana, a quaternary 
unit of 4x20 kauris (kauri or cauri coral beds or shells are used as a kind of money; 
Trikandasesa, III, 2, 206), and into Bengali as a quaternary unit of cardinality 20x4 = 80 for 
kauris, betel leaves and bundles of paddy, as it is still used in Munda and in PWL, see §5. pon 
has also been renewed as cardinal ‘four’ in many AA languages, see §8. It is not used as 
cardinal ‘four’ in PWL, where it is still used as a quaternary CU. In the same way kur is not 
used as cardinal ‘twenty’, probably because the CU kuri is still widely used in PWL. 


4. Counting units and their numeration bases in PWL 
Before explaining how CU’s transform into cardinals, we must understand a little bit what 
CU’s are and what it means to “count” with them. Though used in oral languages, the notion is 
far from "primitive". Before entering into the details of PWL CU’s, I present here two 
important properties of these numbers which will play an interesting role when explaining 
cardinal value shifts of CU words. 

Counting units always have a fixed numeration base but often have variable cardinality 
(i.e. number of elements) depending on what is counted, as shown in this section. The 
consequences of this most important feature for AA cardinal shifts are discussed in §8 and §9. 

There are different arithmetical uses of cardinal ‘one’, which may correspond to the uses 
of three number words for the current cardinal value ‘one’ in PWL. In other words, the 
cardinal word ‘one’ has different mathematical uses which are disambiguated in PWL with 
*mi, *ʃi and diep. diep, described in §5.1, is a set containing one element (i.e. a CU of 
cardinality one); mi is mainly used as cardinal ‘one’; ʃi/tʃi is mainly used to count ‘one’ for 
any CU, for measures and for powers of ten. For example, in War: ʃi pʰu:a ‘ten’, lit. ‘one-ten’, 
ʃi swa? ‘one hundred’, see Table 1. mi and ʃi are both used in ʃi swa? mi ‘one hundred one’. 
These two different uses of ʃi and mi correspond to the fact that there is no tradition of 
positional cardinals in PWL. ʃi/tʃi expresses ‘one’ in different measure units: ʃi khup ‘one 
breadth-of-four fingers’. ʃi/tʃi is also used as ‘one’ for units of time, e.g. the whole day, one 
month's length. mi ‘one’ to express ‘one o'clock’ contrasts with ʃi to express ‘one hour’ as a 
unit of time. ʃi/tʃi may also be used for a kind of unit whose value is not defined numerically, 
as in War ʃi dit ‘a little while’, ʃi kur ‘people from the same clan’, ʃi pero ‘brothers and 
sisters from the same mother’. More precisely, ʃi/tʃi has a qualifying use as opposed to the 
quantifying use of mi. A notion of indefinite multitude can be associated with different 
cardinal numbers, see §5. ʃi/tʃi is also used as a kind of aspectual quantifying device, as in 
War ʃi pam (lit. ‘one cut’) ‘to cut in one blow’. 


4.1. Counting paan leaves in vigesimal, ternary and quaternary bases 


Paan leaves are packed together in a package of twenty leaves called kufi. Then kufi packages 
are grouped by three into bia?. Then those bia? are grouped by four into one ksep. Then the 
ksep are grouped by twenty into one kuri. Finally, two kuri make a full ha? basket sold to a 
retailer of paan leaves on the market. The detailed conversion of these CU into decimal 
cardinal values is described below. The point here is that the perception of the value 
of ha? is free from the representation of individual leaves, just as the linguistic perception of 
‘year’ is free from the representation of seconds. Both of them are ultimately conceived as 
sets of sets of these basic elements counted in different bases, but this is arithmetic and not 
linguistics. 

diep in War and in Pnar is a minimal CU; it amounts to one paan leaf (i.e. diep is a CU of 
cardinality one); ʃi diep represents one CU of one element, not the cardinal ‘one’ paan leaf, 
which is expressed with the cardinal mi ‘one’ as mi sli: patha: lit. ‘one leaf paan’. 

Twenty leaves are bundled into ʃi/tʃi ‘one’ kuffi. One kuffi in War and in Pnar, one fuli in 
Lyngam, has a cardinality of twenty (leaves). bia? in War and in Pnar is a CU of three kufi; its 
cardinality (number of elements) is 3x20 = 60 (paan leaves). 

ksep in War, pon in Lyngam is a CU of four bia?; bia? is a vigesimal CU of 3x20 = 60 
elements. One pon has a cardinality of 4x60 = 240 (paan leaves). 

In Lyngam, gayas is a CU of 4x4 pon = 16 pon for paan leaves. 

pon in Pnar, pen in War, is a quaternary CU which combines with a vigesimal CU; there 
are different kinds of vigesimal CU like kuri and paj, see below. 

The quaternary CU *pon is used for many common goods like betel nuts, bundles of 
firewood and chillies in Pnar, in War and in Lyngam. It is also used in North Munda as a 
quaternary CU which combines with a vigesimal CU. In Santali pon isi is used as a CU of 
four scores, that is 4x20 = 80 elements, see Bodding (1937:644-645, Vol. 4). Mundari has pon 
as a CU for counting silk cocoons in base 4x20; one pon = 20 gandas with a cardinality of 
20x4 = 80, Hoffmann (19987:3388). As indicated by Przyluski (1929: XIV), the numerals pon 
‘80’ and ganda ‘4’ in Bengali have been borrowed from Munda. Before Bengali, Sanskrit 
had already borrowed from AA gandaka ‘a way of counting by four and a coin of four cauri 
shells’. The Lyngam quaternary CU gayas is cognate to this Sanskrit loan. As noted by 
Przyluski, the use of cauri shells as money is not an IE custom. It is characteristic of a 
maritime civilisation; it might have developed on the shores of the Indian Ocean and the 
China Sea where AA languages were dispersed. In the 13th century this money was widely 
used in Bengal, according to Przyluski. 

Another interesting point here is that there is no CU or cardinal word which might be 
related to ganda in Pnaric or in War, which adds an element to the isoglosses analysed in 
Daladier (2011) showing that Lyngam groups have probably remained in the vicinity of North 
Munda groups in the Gulf of Bengal after the separation of South Munda groups from Pnaric, 
Lyngam and War. As shown in §8, the AA quaternary CU pon has been transformed into the 
cardinal ‘four’ in most AA languages. 

kuri is a CU of twenty ksep in War; its cardinality is 20 x 240 = 4800 paan leaves. 

In Pnar and in War tʃi/ʃi ha? (the full long and light special basket for paan leaves) 
contains two kuri = 2 x 4800 = 9600 paan leaves. This ha? basket also amounts to two gayas 
in Lyngam. 
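
The nested packaging described in this section can be checked with a few lines of arithmetic. The following Python sketch is only an illustration: the ASCII unit names are simplified stand-ins for the War forms cited above, and the table name `PAAN_UNITS` and the function name `cardinality` are hypothetical.

```python
# Each counting unit (CU) is given as (base, sub-unit): one CU = base x sub-unit.
# ASCII names are simplified stand-ins for the forms cited in the text.
PAAN_UNITS = {
    "diep": (1, None),     # minimal CU: one leaf
    "kufi": (20, "diep"),  # vigesimal: 20 leaves
    "bia": (3, "kufi"),    # ternary: 3 kufi
    "ksep": (4, "bia"),    # quaternary: 4 bia?
    "kuri": (20, "ksep"),  # vigesimal: 20 ksep
    "ha": (2, "kuri"),     # full basket: 2 kuri
}


def cardinality(unit, units=PAAN_UNITS):
    """Return the number of individual leaves contained in one `unit`."""
    base, sub = units[unit]
    return base if sub is None else base * cardinality(sub, units)


for name in PAAN_UNITS:
    print(name, cardinality(name))
```

Each unit only records its own base and the unit immediately below it, which mirrors the point made above: the value of a ha? basket is stated in kuri, not in individual leaves.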

Old Khmer has slik ‘400’ for paan leaves. PWL has kani, a CU of cardinality 400 for 
betel nuts. slik and kani have a cardinality of 400 but they are different CU's. CU of paan 
leaves and CU of betel nuts are used again inside higher specific units of these goods. 


4.2. Counting citrus and betel nuts in combinations of CU’s in bases two, four, five, ten 
and sixteen 


pəntrə? in Pnar, pantro:u in War, is a quaternary CU of four hali, that is a CU of cardinality 
4x4=16 pieces of citrus or betel nuts. 

bhar is a CU in base two, a CU of two pantra?. Its cardinality is: 16x2=32 pieces of citrus, 
betel nuts or eggs. 

soy in Pnar is a unit of five pantro?. Its cardinality is: 5x16=80 pieces of citrus etc. 

lonti/ luti is a CU in base 4x4 of pantra? bhar for betel nuts. Its cardinality is 16x32 = 
512 betel nuts. 

In War ta:, in Pnar and in Khasi kti, ‘hand’ has cardinality ‘ten’ (like the ten fingers of both 
hands) for betel nuts. 

pu is another decimal CU, a CU of 10 ta: ‘hand’ in War or 10 kti ‘hand’ in Pnar and in 
Khasi. Its cardinality is 100 betel nuts. 

kani is a quaternary CU of pu, that is 4 pu or 400 betel nuts. 

pantra? kani has 16x400, that is 6400 betel nuts. 

paj is a vigesimal CU of twenty kani, that is 8000 betel nuts. 

Lyngam Trei uses the CU dara to count by fives. This CU is not used in Pnar, Khasi or 
War. 
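
The cardinalities listed in this section are products of the bases stacked in each unit. As a rough check, the Python sketch below multiplies out each chain of bases; the dictionary name `BETEL_UNITS` and the ASCII unit labels are illustrative assumptions, not transcriptions.

```python
from math import prod

# Each unit is listed with the chain of bases that builds it up from single
# pieces (citrus or betel nuts). Labels are ASCII stand-ins for the cited forms.
BETEL_UNITS = {
    "hali": [4],
    "pantra": [4, 4],             # four hali
    "bhar": [2, 4, 4],            # two pantra?
    "song": [5, 4, 4],            # five pantra?
    "lonti": [16, 2, 4, 4],       # sixteen bhar
    "hand": [10],                 # ta: / kti
    "pu": [10, 10],               # ten hands
    "kani": [4, 10, 10],          # four pu
    "pantra kani": [16, 4, 10, 10],
    "paj": [20, 4, 10, 10],       # twenty kani
}

cardinalities = {unit: prod(bases) for unit, bases in BETEL_UNITS.items()}

for unit, bases in BETEL_UNITS.items():
    print(f"{unit}: {' x '.join(map(str, bases))} = {cardinalities[unit]}")
```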


4.3. Counting bundles of firewood, kauris and dry chilies in combinations of quaternary 
and vigesimal bases 


bdi is a CU of 20 bundles of firewood, kauris or chillies in Pnar and in War. 

pon in Pnar and in Lyngam is a CU of 4 bdi, that is a CU of 4x20 = 80 bundles of 
firewood, kauris or chillies. 

ka:o in Pnar, War and Khasi has 16 pon (or pen) = 16x80 = 1280 bundles of 
firewood, kauris or chillies. 

In War, eggs are counted and sold by fours in hali like citrus, or in bdi by twenties and also 
by eighties (4x20) in pon. 


4.4. Sensitivity to what is counted and shifts of values when CU words are renewed as 
cardinals 


The notion of CU is sensitive both to what is counted and how it is counted, especially how 
goods are packaged for wholesale and retail trade. For example citrus are retailed by fours, 
while paan leaves are tied together in units of twenty leaves for retail sale. Twenty pieces are 
not named the same if they refer to paan leaves or to fresh fish. However, the same unit may 
be used to name different quantities of different things counted within the same enumeration 
base. One kuri of fresh fish amounts to 20 fish but one kuri of paan leaves amounts to 20 ksep, 
that is 4800 leaves. In Pnar and in War, one pon of paan leaves amounts to 240 pieces but one 
pon of chillies or bundles of firewood or kauris amounts to 80 pieces. 

As described here, there are several vigesimal units in PWL which may depend on the 
counted goods or which may depend on the sub-units which are counted, like kuri for twenty 
ksep and kuffi for twenty diep of paan leaves. In the same way, there are several units of 
cardinality 80 depending on how this number is calculated, which involves the kind of goods 
counted. 

There are different counting units in base sixteen in PWL depending on what is counted. 
For example in War, jhap represents sixteen for children, born from the same mother but 
pantro:u ‘sixteen’ is used for citrus or betel nuts. In Lyngam ka:o is used for sixteen children 
born from the same mother. In War, Pnar and Khasi, ka:o represents 16 pen or pon of bundles 
of firewood, cauris or chillies. 

The number of elements of a CU, that is its cardinality, varies according to what is 
counted, but its numeration base does not change. As shown in §8, renewals of CU words into 
cardinals depend on their numeration base. In other words, in their original uses, CU's are 
unrelated to cardinal values, but when cultures change and the most frequently used CU's are 
transformed into cardinals, they usually take the value of their numeration base (e.g. Kn 
‘hand’; ‘a way to count specific things by fives’ > ‘five’). 
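
The contrast between a fixed numeration base and a variable cardinality can be made concrete with the figures given in this section. The Python sketch below is a minimal illustration; the table `USES` and the function `bases_and_cardinalities` are hypothetical names, and the sub-unit sizes are the ones stated above (ksep = 240 leaves, bia? = 60 leaves, bdi = 20 pieces).

```python
# (unit word, goods counted) -> (numeration base, size of the sub-unit counted)
USES = {
    ("kuri", "fresh fish"): (20, 1),     # 20 single fish
    ("kuri", "paan leaves"): (20, 240),  # 20 ksep of 240 leaves
    ("pon", "paan leaves"): (4, 60),     # 4 bia? of 60 leaves
    ("pon", "chillies"): (4, 20),        # 4 bdi of 20 pieces
}


def bases_and_cardinalities(word):
    """Collect every base and every cardinality attested for one CU word."""
    bases, cards = set(), set()
    for (unit, _goods), (base, sub_size) in USES.items():
        if unit == word:
            bases.add(base)
            cards.add(base * sub_size)
    return bases, cards


for word in ("kuri", "pon"):
    bases, cards = bases_and_cardinalities(word)
    print(word, "bases:", sorted(bases), "cardinalities:", sorted(cards))
```

For each word the set of bases has a single member while the set of cardinalities has several, which is the property that the later sections rely on when a CU word is renewed as a cardinal.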


5. The two cardinal number systems in PWL: Pnaric and War 


Table 1 presents cardinal numbers of different varieties: Jowai Pnar (JP), Ralliang Pnar (RP), 
Standard Khasi (SK), Langkymma Lyngam (LL), Kudeng Nongbareh-Nongtalang War (KW), 
Nongbareh village War (NW), and Thangbuli Amwi War (TW). mi // ʃi (or wi // tʃi) represents 
the contrastive pair for ‘one’ analysed in §4. mi or wi is used for ordinary cardinal ‘one’ while 
tʃi or ʃi is used to express one (thousand), one (hundred), one (ten), that is to count ‘ten’ and 
its powers. 

In §7 and §8, I will analyse the similarities between PWL and AA cardinals and link many 
of them to PWL CU and also link some of them to cardinal loans from TB, Chinese and IA. 

In Pnar the loss of /m/ or /b/ in onset position of monosyllabic words is frequent, as in mi 
> wi (one), ba > wa (dependency marker). This loss is usually not found in Khasi and in 
Lyngam. Since Khasi and Lyngam nevertheless have w-initial forms for cardinal ‘one’ (see 
Table 1), this shows that the Khasi and Lyngam cardinal systems have been borrowed from 
the Pnar one. 

Pnar cardinals slightly differ from the War ones. Numerals expressing 1, 2, 3, 4, 5, 6, 10 
have common roots but 7, 8, 9 and teens are different. 

hn-/ khn- /kan- is found in numbers expressing seven, eight and nine in AA, including 
Pnaric and War. In War, hntʰla: ‘seven’ contains hn- and la: ‘three’. This hn-/ khn- /kan- 
formative might be related to an AA subtractive frozen element from expressions denoting 
those cardinals as ten less one or two or three, where either the word for ‘ten’ or for ‘one’, 
‘two’, ‘three’ would be dropped after the expression is frozen. Examples of such frozen 
reduced building blocks in PWL are analysed in the next section. 


Table 9 — Cardinal numbers in PWL 


JP RP SK LL KW NW TW 
1 wi:// tfi wi:// ffi wej // fi owo//fo |mi//fi mi://fi mi: // fi 
2 Kë ar KÉ "air "A är 709 70) ùr 
3 lg: lg: laj laj-re la:/ laj la: la:/ le: 
4 S2: So: sa:o sa:o-re re:a ria sia 
5 san san san san-do ran ran san 
6 hnru hndru hnri:u hara: trou trou t^ro:u 
7 hnpa:o hnpa:o hnneu hnju-re hnt'la: hnt*la: hnt^la:/ hntle: 
8 pra: pra: pra: pra:-re hmp?3 hmp?üo hmp?ü 
9 Kinde kënde: k"ndaj k'ndaj-re | hnf?a: hnf?a: hnf?e: 
10 ffi p^a:o ffi p^a:o fi p'e:u tfo p^u: fi p^u:a fi phu:a fi ptu:a 
11 Kat wi: Kat wi: Kat wej Kat wo Si p'or mi Si p'ar mi Si p'or mi 
12 Kat Var Kat Zar Kat Zar Kat Zar Si phor "5 fipher?üo | fi por i 
15 Kat san Kat san kat san kat san J pian ran | fipřənran | fi p'on san 
20 "arr p^a:o "a:r p^a:o ?a:r pheu ?a:r p'u: Sr p'u:a "Hor p'u:a ?ür p^u:a 
31 le: p^a:0 wi: | le: p^a:o wi: | laj p^e:u wej | laj pu: wo | laj p^u:a mi: | la: p"w:a mi: | la: pia mi 
100 | tfi spa? ffi sp^a? fi spa? fa spa? | fiswa? fi swa? fi swa? 
1000 | tfi hads ar ffi hadgar Si hadgar ffohadgar | fi hadgar Si hadgar Si hadgar 


kʰnde:, kʰndaj ‘nine’ in Pnaric might be related to such a reduced frozen building block 
referring to subtractive expressions involving the word ‘hand’ in Pearic, Khmuic, Palaungic 
and Old Khmer. The word ‘hand’ is still used in PWL as a CU with cardinality ten. ta:/ kti in 
PWL represents a CU of cardinality ten for betel nuts, see §4.2. In Temiar, Aslian, ‘ten’ is 
expressed by jas-tik ‘complete hands’. 

The number word for ‘eight’ in Palaungic-Waic, in Khmuic and in Pearic seems related to 
kʰndaj ‘nine’ in Pnaric: in Pearic, khanta:j ‘eight’ in Samre and ka:ti: in Chong; in Khmuic, 
Mlabri ti? ‘eight’ and Tayhat kantaj ‘eight’, see Thomas (1976:70); and ndaj in Tung Va, tai’ 
‘eight’ in Wa (Waic), ta? in Palaung (Pan-ku), see Luce (1985). Jenner (1976:44; 48) also 
mentions kati: for ‘eight’ in Old Khmer (in some uses) and reconstructs *kti: ‘eight’ in 
Proto-Khmer. So, most interestingly, the name of the hand may then express, possibly with 
frozen and possibly zeroed prefixes, ‘five’, ‘eight’, ‘nine’ and ‘ten’. 

In War, the cardinal hmp?Pia ‘eight’ also looks like a frozen expression ‘ten less two’ that 
is hm-p-Piia lit. ‘less (from) ten two’. hm- might be related to hn-/ khn- /kan- as a secondary 
phonological development of a reduced frozen subtractive expression before -p- as a 
reduction of pu ‘ten’ before the glottalized vowel of "Ga ‘two’. 

pra: ‘eight’ in Pnaric has no cognate in AA; it might be related to pru?/ p^ru: a length 
measure used especially for the length of cubic shaped baskets in Pnaric and in War. 

swa? ‘100’ in War is most probably a late borrowing from IA, see Bengali sa: and Desya 
soa ‘100’. This borrowing contrasts with Pnaric spa? ‘100’, which for Henderson (p.c. to 
Shadap-Sen 1981) might be TB. 

hadzar ‘thousand’ is IA, probably borrowed in PWL from Bengali. In many North and 
South Munda languages like Santali, Gorum, Khana, the cardinals for ‘100’ and ‘1000’ are 
also borrowed from IA. 

Like other present-day AA numerals, PWL numerals are positional cardinals, as in English. 
Cardinals higher than ‘one’ are pre-posed to express tens and powers of ten in a multiplicative 
way, while post-posed cardinals are used in an additive way. For example, in War: laj swa? 
‘three hundred’ contrasts with ʃi swa? laj ‘one hundred and three’. 
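
The multiplicative reading of pre-posed cardinals and the additive reading of post-posed ones can be sketched as a tiny evaluator. This is only an illustration under stated assumptions: the ASCII spellings (`shi`, `swa`, `phua`) stand in for the War forms, the lexicon is limited to the words cited in the text and Table 1, and the function name `value` is hypothetical.

```python
# ASCII stand-ins for War number words cited in the text (assumed spellings).
POWERS = {"phua": 10, "swa": 100}         # 'ten', 'hundred'
DIGITS = {"shi": 1, "mi": 1, "laj": 3}    # counter 'one', cardinal 'one', 'three'


def value(expression):
    """Evaluate a numeral phrase: a digit before a power multiplies it,
    a digit after the last power is added."""
    total, pending = 0, None
    for word in expression.split():
        if word in POWERS:
            # pre-posed digit (or implicit one) multiplies the power
            total += (pending or 1) * POWERS[word]
            pending = None
        else:
            if pending is not None:
                total += pending
            pending = DIGITS[word]
    # a trailing digit is post-posed, hence additive
    return total + (pending or 0)


print(value("laj swa"))      # three hundred
print(value("shi swa laj"))  # one hundred and three
```

The same word order rule accounts for the row for 31 in Table 1, where ‘three’ precedes ‘ten’ and ‘one’ follows it.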


6. Cardinals expressed with reduced forms of frozen arithmetical expressions or affixed 
frozen number classifiers 


In S. Khasi cardinals 28 and 29 are currently expressed as reduced forms of ‘30-2’, ‘30-1’; 
‘30’ is a widespread symbolic lucky number in PWL narratives: 

‘28’ laj pʰeu nar < laj pʰeu nadu? ?ar ‘three tens less two’ 

‘29’ laj pʰeu nawej < laj pʰeu nadu? wej ‘three tens less one’ 

Gorum, a south Munda language, also uses a subtraction device, in a frozen reduction 
form gab ‘ten’ to express ‘nine’ as ‘ten less one’, see Zide (1978:1;61). 

In War, counting units may be used with lower cardinal numerals: ‘one’, ‘two’, ‘three’, 
within different expressions meaning that a quantity is added to or removed from a unit. For 
example with hali, a unit of four betel nuts or citrus or eggs, and with kuri, a unit of twenty 
paan leaves: 

ʃi hali la mi ‘one CU of four-pieces, one (piece) extra’ has the cardinal value of five 
pieces, with la < lea ‘go’ being a grammaticalized serial element expressing the fulfilment of 
the hearer’s expectation. This is explained by the fact that citrus are retailed by fours and one 
cannot buy five oranges. To ask for five is a way to bargain five for the price of four. The 
seller usually answers ?u:r hali la mi (lit. two sets of four, one extra), that is nine oranges for 
the price of eight. tʃi hali in Pnar, ʃi hali in War, ʃə hali in Lyngam is ‘one’ CU of four pieces 
of citrus, betel nuts or eggs. ‘Two’ hali, a set of eight (oranges), is expressed by ?u:r hali with 
the cardinal number ?u:r ‘two’ in War (see Table 1) derived from the CU number bhar in base 
two, a CU of two pantro?. 

mi han.dap fi kuri ‘one piece not fulfilling one unit-of-twenty’ (paan leaves) has a cardinal 
value of 19 (paan leaves). Here han is a negation which constructs with dap ‘(to be) full’. 
PWL dap ‘full’ is perhaps related to the MK cardinal dap ‘ten’, found in modern written 
Khmer in additive forms, see Jenner (1976:48) and in Bahnaric, see Thomas (1976:80). 
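
The additive and subtractive expressions discussed in this section reduce to two small formulas over a unit's base. The Python sketch below works through the examples given in the text; the function names `extra` and `short_of` are hypothetical labels for the two patterns.

```python
def extra(n_units, base, extra_pieces):
    """'n units of `base`, plus some extra pieces' (additive pattern)."""
    return n_units * base + extra_pieces


def short_of(n_units, base, missing):
    """'some pieces short of n units of `base`' (subtractive pattern)."""
    return n_units * base - missing


print(extra(1, 4, 1))      # one hali and one extra
print(extra(2, 4, 1))      # two hali and one extra
print(short_of(1, 20, 1))  # one piece short of one kuri of paan leaves
print(short_of(3, 10, 2))  # S. Khasi 'three tens less two'
print(short_of(3, 10, 1))  # S. Khasi 'three tens less one'
```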

Two or three number classifiers are used for humans, cattle or goods in Pnaric, in War and 
in Lyngam. In Pnar and in Khasi ŋut is used for people, təlli for goods, dur for pairs of 
animals. In War be is used as a number classifier for people, as a suffix according to 
intonation, and khlon for whatever is not human. They add the idea of being a pair or a triad 
etc. for the counted elements, which may be in context a set of exactly two, or exactly three 
etc. elements; for example in War: "ár.be ^i hun ‘a couple of children; both children’, laj.be i 
hun ‘a triple of children; the three (of them) children’; "09 khlon "i kwoj ‘a couple of betel 
nuts’. In Pnaric as in War, number classifiers are post-posed to cardinals. 

Langkymma Lyngam has a peculiar feature: its cardinals from ‘3’ to ‘9’ combine with the 
suffixes -re and -de to count anything, -re for ‘3’, ‘4’, ‘6’, (‘7’), ‘8’, ‘9’ and -de for ‘5’. After 
nine, Lyngam uses Pnar classifiers, ŋut as a classifier for people and təlləj as a classifier for 
both animals and goods. The suffixation of the former classifiers -re and -de from ‘one’ to 
‘nine’ is a common feature of Lyngam and Gta? in South Munda. Zide (1978:57) shows that in 
Gta?, -re and -de are suffixed to cardinals to count people (as opposed to cattle and goods) in 
the following way: -re/rwa is added to cardinals ‘3’, ‘4’, ‘7’, ‘8’, ‘9’, ‘10’ and -de/-da to 
cardinals ‘5’, ‘6’. This suggests a previous contact between Lyngam and Gta? before the 
Lyngams settled in Meghalaya and adopted the Pnar cardinals, and after they remained 
isolated from the group which became PWL when they settled in Meghalaya and from 
southern Munda groups, when grouping numbers in base five were still used. Lyngam has 
isoglosses with North Munda which are not found in Pnar, War and Khasi (see Daladier 2011) 
and has probably remained in contact with later Kherwarian groups in the Gulf of Bengal. 

*ra: is a classifier for people in West Bahnaric, derived from ra:? ‘big, adult human’, see 
Adams (1992:110-111). -r is used in Munda and in PWL to denote inhabitants of places, as in 
Kherwar or in War (wa-r ‘people of the floods, rivers and incantations’). 


7. Comparison of cardinal numbers in AA and their connection to PWL CU's 
7.1. ‘One’ *mi in PWL 


‘One’ is expressed as *mi in PWL cardinals, probably derived from mon; in War mon is a CU 
of one load of 100 kg of edible seeds; most AA languages have a related cardinal ‘one’, *mu:j, 
*muaj, *mu:n, Shorto (2006:1495), (Luce 1985: vol 2, chart A). 

In Munda, Sora has muj/mi; Gutob, Remo muj; Santali mit; Ho, Mundari mid; Zide 
(1978:35-73) reconstructs *mi ‘one’ and also notes in Zide (1976) the positional use of mi, as 
in Gorum mi kad ‘one twenty’ (like one in English one hundred). 

*mu:j, *muaj, *mu:n, *mi used as cardinal ‘one’ is then associated with another use for 
counting ‘one’ quinary or vigesimal CU. mun/ miad/ mon is found with different quinary 
PWL CU words like *ta ‘hand’, e.g. in miad ti ‘five’ lit. ‘one hand’ in Turi, Munda. It is 
found as a frozen prefix of cardinal ‘five’ in many other Munda languages: Mundari 
mon-reya, Korku mon-oe, Gorum mon-loj, Sora mon-loj ‘five’. Accordingly, it is found frozen 
as m- / ma- as a vestige of quinary CU in msun ‘one-(set of) five’ in Old Mon; in Aslian, 
Semelai has mesor ‘five’, also expressing one-(set of) five. Jenner (1976:58) traces back the 
use of this frozen m- to epigraphic Old Khmer m-bhai, with bhai a quinary unit with a 
secondary use for ‘five’, ‘ten’, ‘twenty’ or ‘one thousand’. *mi can also be used as a counter 
‘one’ for vigesimal numbers, as in Gorum (Munda) mi-kad ‘one-twenty’, and for tens. 


7.2. ‘One’ *ʃi in PWL 


*ʃi is used in PWL cardinals as ‘one’ for powers of ‘ten’ expressed as one-ten, one-hundred 
etc. while mi is used as simple cardinal: ʃi hadzar mi ‘one thousand and one’. In AA usually mi 
is used like ‘one’ in English, both pre-posed as ‘one’ for powers of ten and post-posed as an 
additive ‘one’. 


7.3. ‘Two’ *?a:r in PWL 


PWL CU bor: AA *bar ‘two’, see Shorto (2006:1562) 
bar (‘bar/*bar/’ba/ubar) > bhar > "a:r/u:r//a:xr > "är ‘two’ in Kudeng War 

Most Munda and MK languages have *bar for the cardinal two, see Zide (1978), Diffloth 
and Zide (1976) and Thomas (1976). 


7.4. ‘Three’ *la: in PWL 


The usual form in MK languages is pe according to Thomas (1976:67-68): Mlabri (Khmuic) 
has pae’, Bahnaric, and Pear, Katuic have pa./ pe./ pae/ paj ; Mon has paa? ; Palyu has paj”; 
in Munda Zide (1978) shows that Santali has pe, Ho ape, Mundari apiya, Korku ap^ai, Korwa 
pei, Turi pea, Kharia u?phe/uPfe. 

Palaung has le, Wa loj, lue?, Parauk Wa loe, Lemet lohe and Central Nicobar Joel lu:e, 
(Luce 1985: vol 2, chart A). Thomas (1976:68) tries unconvincingly to derive the la:, loj form 
from pe. AA *la: might be a borrowing from li, a ternary unit of length measure of 300 bu or 
360 bu used in several Chinese dynasties, with shifts in its value over a long period, and 
widespread from the Shang dynasty (16th-11th century B.C.) to modern times. Cœdès (1989) 
states that the Chinese length measure li is widespread in Hinduized kingdoms along the 
Chinese commercial maritime route. 


7.5. ‘Four’ PWL *si: 


All AA languages have *pon ‘four’ except PWL probably because it is still widely used there 
as a CU. 

‘Four’ *saw in Pnar, Khasi and Lyngam and *si:a in War is very rare in AA; it appears in 
a few Khmu varieties in China and in Laos. Archaic Chinese has sied ‘four’, Luce (1985: 
vol.2, chart W quoting Karlgren). It seems to be a Chinese loan via Thai si: ‘four’; War 
has probably borrowed *si: ‘four’ from Tai Ahom. Pnaric has transformed *si: into *sa:, 
which is a regular correspondence between Pnaric and War, e.g. ‘leaf’ sli: in War, sla: in Pnar 
and Khasi. Tai si: ‘four’ > si:a in Thangbuli (Khorvi) War and ri:a in Nongtalang War; sa: > 
saw > sɔ: ‘four’ in Pnaric. 

As cardinal ‘four’, *pon is found in Monic, Mon pon, Nyah Kur pan; Old Khmer pvan; 
Aslian, Semelai Ampon; Bahnaric, Stieng puon, Rengao pun, Nhyaheun pan, Loven puan; 
Katuic, Bru pó:n, Pacoh poan; Khmuic, Khao po:n, Mlabri pon; Viet-Muong, Muong pon; 
Palaungic, Palaung pu:n, Parauk Wa pun; Palyu Lai pu:n; Munda, Santali pon, Korku ap^un, 
Turi punia, Kharia i?pon/ iPfon; Car Nicobarese fe:n, see Bodding (1937: 644, Vol.4), Thomas 
(1976:67-68), Zide (1978). 


7.6. ‘Five’ *san in PWL 


AA *soy ‘five’; PWL CU soy ‘five’ pantra?. say/soy as cardinal ‘five’ is found in Aslian, 
Katuic, Khmuic, Bahnaric and Monic but apparently not in Munda (Zide 1978). 

‘five’ say in Loven (Bahnaric); with m- as a reduced form for ‘one’, m-sun in Old Mon, 
me-sor in Semelai (Aslian); the use of m- shows that cardinals are counted in base five in Old 
Mon and in Aslian; this is also the case in Turi (Munda) with the name of the hand, see 
below; (pa) soy in W Bahnaric, Katuic, Khmuic and Monic (Thomas 1976). 

The reduced form san/ ran ‘five’ as cardinal ‘five’ is probably used instead of soy in PWL 
to avoid ambiguities. 


7.7. ‘Six’ *dru in PWL and [tV] dru in AA 


War has trou ‘six’ and Pnaric hndru ‘six’. 

North Bahnaric has *tadraw; West Bahnaric, Nyaheun tro, Loven tra:o, Rengao tudru; in 
Monic, Nyah Kur has traw, Old Mon has turow, see Shorto (2006:1851). Thomas (1976:70) 
adds as cognate pru? ‘six’ in Semelai and prau in Viet Muong. In Munda, Sora has tudru, 
Korku, Ho and Santali have turui, Mundari turia; Gorum tur-gi, Gta? tur-da, Zide (1978:35- 
73). Matisoff (2003: 149) reconstructs *d-ruk ‘6’ in PTB. The cardinal ‘six’ *[ ] dru in PWL 
and [tV] dru widely found in AA looks strangely close to PTB. There is no numeration in base 
six in any of the AA languages, no PWL CU related to this etymon and no numeration base 
six in any PWL CU. I have no explanation for this interesting question, which I leave open. 


7.8. ‘Ten’ *pʰu in PWL, Pnar pʰa:o, S. Khasi pʰeu, War pʰu:a, Lyngam pʰu 
As seen in §5.3, pu is a decimal CU for counting betel nuts by ‘ten hands’, that is 10x10 = 100 
betel nuts. The PWL decimal CU pu is perhaps related to the Old Khmer quinary element bhai, 
used as a quinary CU to count sets of 5, 10, 20 or 100 elements. Apparently, modern Khmeric 
and Pearic languages express the cardinal ‘twenty’ with this element and a ‘one’ prefix: 
Khmer ma-phij, Northern Khmer mi-phej, Pear and Suoy ma-phej, Chong p^a-*j, see Thomas 
(1976). It seems that modern Khmer, Pearic and PWL languages use the quinary O. Khmer 
element with two shifts, just as the vigesimal CU kur is renewed as cardinals ‘ten’ and 
‘twenty’; also the word ‘hand’ in AA is renewed as cardinals ‘five’ and ‘ten’. Hence the word 
‘hand’ with its derived forms ta: in War and kti in Pnaric is used as a decimal CU of betel nuts 
in PWL, but it is used both as cardinal ‘five’ (one hand) in Munda and as cardinal ‘ten’ (both 
hands) in Temiar, Aslian: jas-ti:k ‘complete hands’. The cardinal ti ‘five’ appears in Turi, 
Munda. The cardinals in Turi are not decimal but enumerated in base five: miad ti ‘one hand’, 
‘five’; miad ti miad ‘one hand one’, ‘six’, see Zide (1976: 40). 
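
The base-five pattern illustrated by the two Turi forms can be sketched as a decomposition into ‘hands’ plus a remainder. The Python function below is a hypothetical illustration (the name `quinary_gloss` is not from the source); only the values five and six are attested in the text, so any other output is an extrapolation of the pattern and not a claim about Turi.

```python
def quinary_gloss(n):
    """Gloss a positive integer as a number of 'hands' (fives) plus a remainder."""
    hands, rest = divmod(n, 5)
    parts = []
    if hands:
        parts.append(f"{hands} hand" + ("s" if hands > 1 else ""))
    if rest:
        parts.append(str(rest))
    return " + ".join(parts)


print(quinary_gloss(5))  # miad ti 'one hand'
print(quinary_gloss(6))  # miad ti miad 'one hand one'
```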

The words for cardinals ‘seven’, ‘eight’, ‘nine’ usually do not use directly CU words 
unless using frozen reduced arithmetic expressions; it appears that these numbers have not 
been used as numeration bases in AA. 

The PWL element *ʃi used to count ‘one’, especially for powers of ten, as seen in Table 1, 
is most probably related to a borrowed element for ‘ten’, powers of ten and teens in AA. 
Jenner (1976) shows that cas /cos ‘ten’ in Old Mon and then cah /coh ‘ten’ in Middle Mon have 
corresponding forms in Bahnaric and Katuic (with final t): 


(1) - git Fat ‘ten’ in: Alak, Bahnar, Biat, Cil, Haling, Jeh, Kaseng, Kóho, Phnong, Rengao, 
Sedang, Sre, Sué, Stieng 
(2) - cit/cat ‘ten’ in Boloven, Brou, Kontu, Kuoy, Lave, Pacoh, Praok, Sedang 


Jenner (1976:46) analyses this element as a loan from a Chinese cardinal ‘ten’, found in 
modern Thai and Tai as sip ‘ten’, borrowed from Chinese by Thai, AA and Indonesian groups. 
It is related to Karlgren's Old Chinese 3’ap; ts?iet ‘ten’ at some Fu-nan period; tsay ‘ten’ 
reconstructed by Bradley (2005:Table2) for South-eastern PTB and */syay ‘ten’ reconstructed 
by Matisoff 1988: [STC 7408] for PTB. 

In PWL, the vigesimal unit kuri is widely used, with a cardinality which depends on what is 
counted, as seen in §5. As cardinal ‘twenty’, it is found in South Munda: Gta?, Korwa, Gadaba, 
Turi, Gorum, Kharia. Sora builds its cardinal numbers within a vigesimal base in kuri up to 
400. 

When cardinals and CU words are related, cardinals are phonologically derived from CU 
words. For example in PWL, the three number words are used both for CU and for cardinals: 
mon CU > mi cardinal; bar CU > "a:r/ “air! °3/ "än "ü l"ürl "iia cardinal; sog CU > san/ran 
cardinal ‘five’; pu CU of ten ta: /kti (hand) converted into p^ua/ p'eu/ přou / přu cardinal 
‘ten’. CU have been renewed into AA cardinals according to their numeration base for ‘one’, 
‘two’, ‘three’, ‘four’, ‘five’, ‘ten’ and ‘twenty’. 


8. AA composite cardinal systems and shifts of cardinal values in AA number words 


Disparities in AA cardinals may be due to areal borrowings when cardinals come into use or 
may result from different CU words having the same numeration base. AA cardinals often use 
more or less frozen prefixes from reduced additive or subtractive expressions to express the 
numbers ‘seven’, ‘eight’, ‘nine’ which, as it turns out, do not correspond to numeration bases 
in CU. ‘Seven’, ‘eight’, ‘nine’ are often expressed as additive expressions on ‘five’ or 
subtractive expressions on ‘ten’. For example, AA *ti:? ‘hand’ (Shorto 2006:66) is used for 
cardinal ‘five’ in Turi and Gorum, Munda, and for cardinal ‘ten’ with the prefix jas ‘complete’ 
(finger hands) in Temiar, Aslian. It is also used for cardinal ‘eight’ in Palaungic and in Pearic 
(Chong and Samre) with a kV- prefix presumably standing for a reduced arithmetic expression, 
as seen in §8 for PWL cardinals ‘seven’, ‘eight’, ‘nine’. 

For cardinals derived from counting units, the words may vary because, as shown in §5, 
there are different counting units in each numeration base, according to what is counted and 
how it is collected, and because there are units of units in the same base. In PWL, ta:/ kti: 
‘hand’ has the decimal use of a unit of ten betel nuts but, as opposed to Semelai, it is pu, a 
higher decimal CU of ten hands of betel nuts, which is renewed for cardinal ‘ten’ in PWL. 

Whether or not CU words are retained in cardinal systems seems less related to cardinal 
values or to arithmetic or linguistic reasons than to conventional uses and areal diffusion 
through trade. For example the conservative AA word ‘leaf’, used as a quaternary CU of 
cardinality four hundred in Old Khmer and as a CU of cardinality four in PWL, did not remain 
as an AA cardinal ‘four’, as opposed to pon, another widespread quaternary unit (of lower 
units, having cardinality ‘240’ for paan leaves and ‘80’ for chillies). The cardinality of a CU 
word (i.e. the number of its elements) is not relevant anyway, as opposed to its numeration 
base, for its renewal as a cardinal word. 

A last reason for disparities is the borrowing of cardinals from IA and Sinitic languages. 
An interesting point is the absence of Arabic number words in PWL although the Mughals 
settled very nearby in Bangladesh. 


9. CU, cardinals and the classification of Pnaric-War-Lyngam, usually called Khasian 


A brief overview of Khasi, Pnar, War, and Lyngam languages, with their core and many 
mixed varieties forming a Pnaric-War-Lyngam (PWL), rather than a Khasian group, is given 
in Daladier (2011, 2014). The analysis of PWL CU and their comparison with AA cardinals, 
as well as the two Pnaric and War cardinal sets, where Khasi and Lyngam cardinals appear to 
be offshoots of the Pnar cardinal set, as shown in §6, Table 1, confirm this view. Comparisons
between Khasi and random Pnar, War and Lyngam varieties based on Swadesh lists are
biased, as Pnar, War and Lyngam have been pervaded by Khasi after Khasi became a lingua
franca fluently and obligatorily⁵ spoken by “Khasi” citizens in the East Meghalaya
constituency. Written Standard Khasi is incidentally the vehicle of mass Christianisation 
under and following the British colonisation. More than 90% of the so-called Khasi 
population in Meghalaya is Christian and Christian “Khasis” consider Hindu and Muslim 
religions as back stage. 

As a result, Khasi language and Khasi institutions are very quickly linguistically and 
politically unifying this group and give it some national identity in the state of Meghalaya. 
The very recent spread of the Standard Khasi language (after Welsh missionaries and British 
political dominance) has already produced many mixed varieties, especially: Pnar-Khasi, 
War-Khasi, Lyngam-Khasi or Khasi-Pnar-Lyngam among others with various TB languages 
on the fringes of the Khasi constituency. Actually the majority of the AA population in 
Meghalaya speaks mixed varieties, see Daladier (2014). Core varieties of Lyngam, War and 
even Pnar are now greatly endangered by Khasi. Urban Jowai Pnar is a written variety, close 
to Khasi compared to eastern rural varieties. Historical information about a rather important
Pnar kingdom settled in Assam and in Bangladesh, which partially took refuge in Eastern
Meghalaya in the 15th century, and about a much smaller Khasi merchant group, first mentioned
around the 17th and 18th centuries by British and French traders, which took
refuge west of the Pnars in the 18th century, is traced back by Shadap-Sen (1981) using
precise information in the Ahom chronicles. The Pnars were interacting with the Tai Ahom


5 Racism heavily corrupts “national” feelings and perceptions about “mankind” in NE India. Standard Khasi is
spoken by Hindu people in Khasi, Pnar, War and Lyngam market places in the Khasi state while so-called
“tribals” have to speak Assamese or Bengali in Assam and in Bangladesh, just a few kilometers beyond
Meghalaya state frontiers. The word “Khasi” is nowadays strongly claimed as an identity both as a common 
nationality and as a common language by all the Pnars, Wars and Lyngams in Meghalaya state. 
before the Khasis or Kosya appeared to be named as a group, probably by plains people. No
internal etymology can be given so far by linguists to Khasi, as opposed to Pnar (< pan-nar
‘power-makers’), Lyngam ‘incantation’ and War (wa-r ‘people of the rivers-incantations’). As
shown especially by the district map of Kharakor (1951), the Khasis were still a minority
group compared to the Pnars just before the British rule.

The term Pnaric-War-Lyngam also looks more appropriate than Meghalayan, as 
Meghalaya is a recent refuge for this group, after the Mogul and Tai Ahom kingdoms settled 
in Assam and in the Bay of Bengal in the 14th century AD. There are still Lyngam and War
communities from Bangladesh joining their clan mates in Meghalaya, especially since the 
partition of India and Bangladesh in 1974. 

The core varieties of Pnar, War and Lyngam have rich systems of polyfunctional 
grammatical particles and they share the property of being mostly isolating languages, with
little derivational morphology and without verbal bases. AA languages like Old Khmer
have very rich derivational morphologies, see Jenner and Pou (1980). These conservative
languages can also be opposed to some Bahnaric languages, like Stieng varieties, spoken in 
Cambodia, which have lost most of their derivational morphology and AA grammatical 
particles but renewed isolating features with grammaticalised lexical elements, see Bon 
(2014). 

PWL appears to be very conservative from a morphological view-point, though Khasi has 
transformed some of the particles and affixes of Pnar, developing a part-of-speech 
morphology with a growing use of specialized affixes for nouns and verbs and verbal bases 
with subject-verb agreement. The PWL CU system appears as one of those AA conservative 
features not found in other AA groups but matching vestiges, see for instance my article on 
argument marking and the genesis of verbal systems, in this volume. 

To sum up this section, recent and very recent sociological facts should not be mixed up 
with linguistic procedures useful for the reconstruction of a conservative North-Eastern AA 
group. Those sociological facts together with sound historical data should be carefully taken 
into consideration. A few available archaeological data on TB and AA megaliths in Assam, in
the gulf of Bengal and in Cambodia might also be useful for sound carbon-14 dating,
probably more reliable than phylogenetic “proofs” based on highly problematic Pnar, War and
Lyngam data collected in the main urban centre or with speakers mainly speaking Khasi. The
comparison of PWL CU words and AA cardinal words confirms other linguistic comparisons,
especially PWL and AA negations (negation words and multiple negative grounding values),
see Daladier (to appear). War is closer than Pnaric to AA languages near the sea (Mon, Aslian
and Sora), while Pnaric is closer to Khmuic and Palaungic (Pnaric especially has a shared
innovation with Khmuic and Palaungic, see §6). The number classifier -re for humans on
Munda Gta and Lyngam Langkma cardinals, which is a rather recent use of a former AA -r 
for people's names, interestingly shows that Lyngam has remained in contact with Munda 
since earlier times than Pnaric and War groups. 

War and Lyngam languages had already diverged from a common ancestor of PWL, 
probably more than what can be now observed, before they took refuge in Meghalaya. This 
guess is grounded on first hand peculiar lexical data (not yet published) recorded in 
Nongbareh War and in Lyngam Trei oral texts involving features of common AA cosmogony 
representations. 


10. Conclusions 

Seven PWL CU words: mon, bar, pon, sor, kti/ta, pu and kur have been renewed according to
their numeration base as cardinals ‘one’, ‘two’, ‘four’, ‘five’, ‘ten’ and ‘twenty’ respectively
in many AA languages. They are actually the most widespread AA number words. Used as
cardinals, the words *mon ‘one’, *bar ‘two’, *pon ‘four’ are found in nearly all MK
languages, see Thomas (1976: 67-68); for PWL, see §5, Table 1; and for many Munda
languages, see Zide (1978:35-73). PWL quinary CU and cardinal ‘five’ *soy is found as
cardinal ‘five’ in West Bahnaric, Katuic, Khmuic, Monnic and Semelaic, see Thomas
(1976:69). *pu, decimal CU and cardinal ‘ten’ in PWL, is also found in Khmeric and Pearic for
‘ten’. Many other AA languages have borrowed their cardinal ‘ten’ from Chinese via Thai or
Tai as shown by Jenner (1976). It is renewed as a pre-positional multiplicative ‘one’ to count
powers of ten in PWL as opposed to PWL post-positional additive mi ‘one’, as shown in §7.

Jenner (1976), quoting Coedès (1942), shows that a former collective number system in
Old Khmer used quaternary, quinary and vigesimal bases. Such numeration bases are used in
PWL CUs. Vestiges of such bases are also widely found in cardinal sets of Aslian, Old Mon,
and Munda as shown by Thomas (1976) and Zide (1978).

The most interesting aspect of PWL counting systems is that PWL still uses a CU system
which appears to be a conservative AA CU system, as the most common AA cardinal words
are derived from its CU words according to their numeration bases and traces of its main
numeration bases are still found in many AA cardinal systems. The PWL CU system enables
us to understand the grouping numbers remaining in Old Khmer and Munda and explains why AA
cardinals are composite systems. The existence of such former AA collective
number systems was already predicted by Coedès and Jenner, though they were not fully
recovered in epigraphic documents.

I have also pointed out here the interest of CU systems and their associated baskets and 
bundling techniques for easy computations on large numbers in oral languages. PWL 
counting units are quite elaborate and useful devices as they represent at once numbers and 
their computation in terms of lower units with their graded containers for trades. The PWL
way to count also allows perception of large space or time units and developed together with 
rich cosmogony representations. 

The interesting innovation kad for ‘teens’ in Pnaric cardinals, derived from the vigesimal 
AA CU *kur, is shared with North Munda (Santali), Khmuic, Palaungic and Bahnaric 
languages but is found neither in War nor in conservative South Munda, Monic or Aslian.
This point matches other data presented in Daladier (2011) showing that War is not a Pnaric
offshoot but rather a sister language descended from a common ancestor. War people took refuge on Pnaric
lands in Meghalaya from Bangladesh very recently⁶.

Affixation of AA -r(e) for people and -de for goods on cardinals as a kind of number
classifier from ‘one’ to ‘nine’ is found in Lyngam and in Gtaʔ (a South Munda language).
Lyngam and Gtaʔ groups might have been in contact before the separation of South and North
Munda groups. Then Lyngam remained in contact with North Munda groups in the West of
the gulf of Bengal, see Daladier (2011), while Pnaric groups were in contact with TB groups
in Assam and while War groups might have been on the hills, east of the gulf of Bengal.
Probably around the 18th century, Lyngam adopted the Pnar cardinal system when the Pnars
and some Khasis extended their territories west of the eastern Khasi territories, see
Shadap-Sen (1981) and Kharakor (1951).

The widespread use of additive or subtractive expressions for ‘seven’, ‘eight’ and ‘nine’ 
shows that AA cardinals are late derived composite number systems. Those arithmetic 
expressions may become frozen prefixes and then may further be reduced or zeroed, as it also 
happens in PWL. This is why the AA name of the hand may denote cardinal values ‘five’, 
‘eight’ (five [plus three]) or ‘ten’ ([complete] hands) in different AA groups. Beyond this 
technical point, from a deeper methodological viewpoint, the matchings of CU words on


6 This can also be inferred from the history of War clans for several generations which are traced back and
recited during the Jum /?jar secondary death rituals (see Daladier 2012). 
cardinal words in AA are interesting as they are not true isoglosses but reflect long-term AA
cultural changes inside an area with Hindu and Sinitic influences. 

Baruah (1985:35) mentions Chinese commercial and diplomatic contacts with early Hindu 
kingdoms in Assam in the second century B.C., from Chinese sources. Baruah (1985:71-110) 
further describes early Indian kingdoms and contacts with Chinese travellers. 

A few centuries later, the Chinese Fu-nan Hinduized kingdoms developed along a
maritime South-Eastern route. Coedès (1989:61) describes extended Chinese contacts with
the Khmers, the Mons and Indonesia especially. Coedès describes Fu-nan kingdoms and their
relations to Indonesia before they were replaced by Angkorian kingdoms, from the first
century A.D. up to the fourth century A.D.

The map of Coedès (1989:497) showing in detail the Chinese maritime route in
connection with Hinduized kingdoms should be compared with the AA map of Luce (1985:
vol. 2, 135) with a Northern interior route joining Hanoi (Tong King) to Assam. Those two
maps should in turn be compared with the Chinese map of Nan Chao reprinted in Luce (1985:
vol. 2, 136), which shows in detail North and South Eastern Asia in the 8th-9th century A.D. and
northern roads linking An-nan in T'ang China to the Brahmaputra river in Assam and the Pyu
capital of a Northern Mon kingdom.

Those trade routes might account for some connections between AA cardinal words and also for
some common borrowings from Chinese cardinals, possibly via Thai and Tai. However,
contacts with Sinitic and Hinduized kingdoms along different trade routes should not be confused
with early migrations of non-sedentarized AA groups. It seems unlikely that PWL, Khmeric
and Pearic groups have been in contact, and their common use of *pu ‘ten’, not shared with other
AA groups, might be a conservative AA vestige of a previous use of *pu as decimal CU.

The fact that the PWL vigesimal CU *kur, widespread as cardinal twenty in Munda, is 
also found as ‘twenty’ or ‘group of twenty’ in Sanskrit, and probably borrowed in PTB, seems 
to be indicative of a relatively early lower Brahmaputra contact area between an ancestor of 
PWL, Munda and TB groups, before the first Hindu kingdoms settled there and perhaps 
before Munda groups separated into northern, Kherwar, and southern groups. 

Historical Hindu-Sinitic contact information from Coedès and geographical matching of
PWL CU words and AA cardinal words, among other linguistic data, favor a hypothesis of
quick dispersion of AA groups south-east and south-west along several rivers including the
Brahmaputra. The ancestor of PWL probably belonged to a lower Brahmaputra area in
contact with Munda groups before the first Hindu-Sinitic contacts around the second century
B.C.

To sum up these concluding remarks, my comparison of cardinal sets in AA makes sense
in the light of Coedès's (1989) history of the genesis of Angkorian and Mon kingdoms after the
disappearance of a Chinese Fu-nan kingdom, all of them using maritime and inland trade
routes; it also sheds some light on Jenner's (1976:58-59) conclusion that the Khmer system of
decimal cardinal numbers is a composite system settled under the influence of trade with
Indian and Chinese merchants during the Fu-nan kingdom⁷. A related point is the later diffusion in
AA of some Chinese cardinals via Thai and Tai influences.

Finally, the conservative AA character of the PWL CU system adds some new arguments 
about AA south-west and south-east dispersion along different rivers including the 
Brahmaputra, as already argued on other features by van Driem (2001:289-94). 


7 The Fu-nan kingdom was established by Chinese traders during the first century A.D. and lasted till the end of
the seventh century. They met pre-Angkorian Khmers and Mons from Menam, see chapters 4, 5, 6, 7 in Coedès
(1989) for historical data on the spread of Eastern Hinduized kingdoms, from Burma to Vietnam and Indonesia.
1. Introduction


In the vast majority of the Eastern and Western Tibeto-Burman languages the finite verb is 
negated by a preverbal particle *ma-, which is reconstructible for Proto-Tibeto-Burman 
(Matisoff 2003: 488). But in a substantial number of languages in and around North East 
India, we find various patterns in which negation is marked postverbally: 


The overall pattern of the position of negative morphemes in Tibeto-Burman can be 
summarized as follows. VNeg order is dominant in an area corresponding roughly to the 
section of India east and northeast of Bangladesh, including most Bodo-Garo, Tani, and
Kuki-Chin languages, while NegV order is dominant in two areas, one to the west, in Bodic, and one
to the east, including Nungish, Jinghpo, Northeast Tibeto-Burman, and Burmese-Lolo
languages. (Dryer 2008: 70)


This paper is a preliminary exploration of the reasons for this variation. I will mostly 
restrict my attention to Kuki-Chin languages, following on observations by Konow in the 
Linguistic Survey of India (Grierson 1904) and C. Y. Singh (1992), and leave consideration of 
analogous phenomena in “Naga”, Karbi, Boro-Garo, and other languages for another time. 


2. The negative prefix *ma- 


The *ma- negative occurs across the family, and is uncontroversially reconstructed to the 
proto-language. Outside of our area it is always preverbal, and almost always a prefix, though it
is also attested as a phonologically independent particle, as in Lahu. Thus Matisoff 
reconstructs a “negative adverb” rather than a prefix. But even in Lolo-Burmese it always 
precedes the finite verb (Bradley 1979: 372) and in every language where the phonological 
structure of the language allows, it is prosodically dependent on the following verb. 
2.1. *ma- outside of North East India


Tibetic languages provide useful examples which illustrate the basic TB negative construction 
and typical secondary developments. In Classical Tibetan the negative particle ma¹ always
directly precedes the finite verb: 


(1) ngas ma bor ro
I.ERG NEG throw FINAL
‘I didn't throw [it].’


The Tibetan writing system has no means of indicating phonological dependence of one 
morpheme on another, but in all contemporary Tibetic languages the negative form is attached 
as a prefix to the verb, and presumably this was also the case in older forms of the language. 

Two other facts about Tibetan negation provide a model which can explain some of the 
developments in negative marking in Kuki-Chin and elsewhere. First, note that, as in other 
verb-final languages, negation typically attaches to the final verb in a sequence, so that in 
constructions with auxiliary verbs, the negative prefix attaches to the auxiliary rather than the 
lexical verb. Over time, as auxiliary constructions become more grammaticalized, this can 
result in constructions in which the negative marker occurs between the verb stem and a TAM 
suffix, as in Lhasa Tibetan: 


(2) phyin-song 
went-PERF 
‘[S/he] went.’


(3) phyin ma-song
went NEG-PERF
‘[S/he] didn't go.’


In Dryer's (2013) scheme this constitutes a shift from pre- to postverbal negation, but for 
our present purposes it does not count as such, because the negative morpheme cannot occur 
as the only verbal operator following the verb, but must be followed by a TAM element on 
which it was originally a prefix. 

The second phenomenon of interest in Tibetan is the special negated forms of the copulas. 
The equational and existential copulas yin and yod have irregular negative forms min and 
med. In the modern languages many tense/aspect forms are based on these copulas. Since a 
copula auxiliary is the final verbal element in a clause, it carries negation, resulting in 
negative constructions which could be synchronically analyzed as having tensed postverbal 
negative forms, as in Lhasa Tibetan: 


(4) nga zos=gi yod 
I eat=IMPF EXIST.PERSONAL 
‘I am eating.’


(5) nga zos=gi med 
I eat=IMPF EXIST.PERSONAL.NEGATIVE 
‘I am not eating.’


1 There are two forms of the negative particle in Classical Tibetan: ma with perfective and imperative stems, and
mi with present and future stems. Only the first is relevant to our present concerns. 
We will see forms in NEI languages which appear to have a similar origin.


2.2. *ma- in North East India


The original negative construction occurs in some languages of NEI. Mongsen Ao shows 
consistent prefixing of the negative ma-, even in the presence of TAM suffixes (Coupe 2008: 
292): 


(6) mà-tfhuwa-34 
NEG-emerge-PRES 
‘doesn’t return’ 


(7) ma-phur-i-uP 
NEG-steal-IRR-DEC 
‘won't steal’


Mongsen Ao also has a negative suffix, which co-occurs with the prefix, only in past 
tense: 


(8) ma-tsapha-la 
NEG-fear-NEG.PST 
‘did not fear’ 


Ao is conservative among its neighbors; many other “Naga” languages have innovative 
negative constructions. But in some of these languages we find frozen forms which attest to 
the earlier use of the older construction. For example, Sumi (Amos Teo, personal 
communication) regularly negates the finite verb, either stem or auxiliary, with postverbal 
-mo (see Section 3.2): 


(9) pa-je à-lé p'óà-mo
3SG=TOP NRL-song sing-NEG 
‘He will not sing.’ 


But the preverbal negative is preserved in fossil form in mtha ‘not know’ (ithi ‘know’) 
and composite postverbal operators -mphi ‘not yet’ (aphi PROGRESSIVE) and -mla ‘unable to 
do’ (-lu ‘able to do’).

In many Northwest Kuki-Chin languages we find a reflex of *ma- occurring as part of the 
string of verbal suffixes, always preceding some other TAM morpheme. An example is Anal 
(Sengoi Singh 1992). Like other NW KC languages, Anal uses the prefixal indexation 
paradigm in affirmative clauses, and the postverbal paradigm in negative clauses (DeLancey 
2013a, b). Clearly the negative construction originated as *ma- prefixed to a postverbal 
auxiliary ni: 


(10) ni ka-ca-wa
I 1SG-eat-TNS
‘I eat.’


(11) ni ca-ma-ni-y
I eat-NEG-TNS-1SG
‘I do not eat.’


The only reported example known to me of an apparent reflex of *ma occurring
preverbally is Daai Chin am (So-Hartmann 2009: 252-254):


(12) ah khhyu:=noh ta am dang-yah mjoh
3SG.POSS wife=ERG FOC NEG suspect-NON.FUT EVID
‘His wife did not suspect, it is told.’


While the resemblance to the pan-TB negative prefix is obvious, both the form (am rather 
than ma) and the phonological independence of this form from the verb stem remain to be 
accounted for. 


3. Postverbal #mak²


The most widespread postverbal negative construction is clearly related to PTB *ma-. This is 
a postverbal, often phonologically independent syllable mak or maʔ, which appears to have
the same kind of origin as the Tibetan min and med forms described in Section 2.1. Some
languages are reported as having postverbal ma, with no final consonant. In some cases this
may simply be a case of failing to transcribe a final glottal stop, or it could represent further
phonological erosion mak > maʔ > ma. These forms occur in a number of languages in
Northern Naga³ and the languages formerly lumped together as “Naga”, including Liangmai,
Maram, Maring, and Zeme (Marrison 1967: 127). Within Kuki-Chin they are reported only in 
the Northwestern subbranch; in the next section we will see some examples. 


3.1. #mak and its origin 


The Linguistic Survey of India reports some form of mak as a postverbal negator in all of the 
“Old Kuki”, i.e. Northwestern KC, languages. In modern descriptions it appears that these 
forms are found only in non-future tenses. For example, Koireng (C. Y. Singh 2010:114-5)
and Moyon (Kongkham 2010) have distinct negative morphemes in the realized/non-future 
and unrealized/future tense. The realized negative is -mak-; the unrealized negative represents 
another form which I will discuss in Section 4.3: 


2 I adopt Bauman’s (1975) practice of using # to indicate a comparative set which has not yet been systematically
reconstructed.

3 In the Northern Naga languages the final /k/ is a 1SG agreement marker, but in the other “Naga” and KC
languages it must have a different origin.
Table 1: Koireng negative paradigms


    | Realized negative | Unrealized negative
1SG | Σ-mok-iŋ          | Σ-no-ni-ŋ
1PL | Σ-mok-un          | Σ-no-ma-ni
2SG | Σ-mok-ci          | Σ-no-ti-ni⁴
2PL | Σ-mok-ci-u        | Σ-no-ti-ni-u
3SG | Σ-mok-e           | Σ-no-ni
3PL | Σ-mok-u           | Σ-no-ni-u

The fact that the negative forms -mak- and -no- are followed, and in the 2nd person
unrealized forms, preceded, by agreement morphemes argues that they originated as auxiliary 
verbs. This is probably also the history of no, which we will return to below (4.3). But mak is 
more complex; it appears to have the same kind of origin as Tibetan min and med, that is, the 
coalescence of what was originally a copula with a negative prefix. 

This hypothesis concerning the origin of mak is not original; already in the Linguistic 
Survey of India Konow had discerned the relation between our postverbal mak forms and the 
general Tibeto-Burman preverbal *ma-: 


It is ... probable that mak is a compound, consisting of the negative prefix ma and a verb 
substantive ... On the whole it may safely be assumed that the negative suffixes in the
Kuki-Chin languages contain a negative prefix which is not, however, prefixed to the principal verb
but to the old copula which is added as an assertive suffix. The negative verb would, 
accordingly, be a compound. The negative particle is usually inserted between the root and the 
tense suffixes, a fact which well agrees with the supposition of its being a verb forming a 
compound. (Grierson 1904: 19, emphasis original) 


There are independent arguments for a copula #yak or #yik at the root of the Northern and 
Northwestern KC “agreement words” (DeLancey to appear), so we might propose that mak <
*ma-yak. But we also have to reconstruct a negative copula #kay for PKC and even earlier
(Section 4.2), so an original source *ma-kay is also plausible.


3.2. Open syllable /m-/ forms 


In a few Northern and Northwestern KC languages there is a postverbal #ma form without the
final /k/:


Lamkhang (Thounaojam and Chelliah 2007: 63)
(13) a-cak-ma
2-eat-NEG
‘Don’t eat!’

Thadou (Haokip 2012: 12)
(14) ka dam mga ée
I well NEG DECL
‘I am not well.’


4 The Koireng Grammar has a misprint in example 24, p. 114: -niti should be -tini. The correct form is given in
the text above on p. 114. 
These could be explained away as reflecting phonological reduction of older mak, but it is
not obvious that this is the only possible explanation for these forms. This question cannot be 
resolved without more detailed information on other Northwestern KC languages. 


4. Other postverbal negative forms 


There are three well-attested postverbal negative forms besides mak, probably reconstructable 
as *law, #kay, and *no. For the first two there is evidence outside of their current status to 
show that they were originally negative copulas. 


4.1. *law 


VanBik (2009: 253, #1035) reconstructs a negative morpheme *law for PKC, based on Haka
Lai Aën, Falam Lai làw, Mizo lo, Thadou Jon (Haokip 2012); see also Bawm Chin lo
(Reichle 1981), Sukte -law (C. Y. Singh, personal communication). Outside of KC proper,
note Meithei £e (non-future) / loy (future) (C. Y. Singh 2000: 144-5). More recently VanBik
(2013) suggests that this originates in *law₁, *lawʔ₂ ‘disappear/lose’ (2009: 249, #1011).⁵
Examples of this form in Kuki-Chin are:


Mizo (Chhangte 1993:92)
(15) kán-thü-toʔ-low
1PL-sit-PPF-NEG
‘We are not sitting anymore.’

Hakha Lai (Peterson 1998)
(16) law ʔa-ka-thloʔ-piak-law
field 3SG.SU-1SG.OB-hoe₂-BEN-NEG
‘He didn’t hoe the field for me.’

Falam (Yu 2007, cited in VanBik 2013)
(17) a-thaj diy a-si-law
3SG-know IRR 3SG-be-NEG
‘S/he shouldn’t find out.’

Thadou (Haokip 2012)
(18) ipii ná bóol low ham
what you do₂ NEG INTERR
‘What is it that you do not do?’


5 Verbs in Kuki-Chin languages typically have two distinct stem forms, generally referred to as Stem 1 and 2; 
this distinction is indicated here and in the glosses to the examples by subscript numerals. 
4.2. #kay


Another negative form in Kuki-Chin is #kay; an example is the postverbal negator -kay in 
Sukte (C. Y. Singh, personal communication): 


(19) ken an ne-kay-in 
I rice eat-NEG-1SG 
‘I do not eat rice.’ 


We will see additional examples in Section 4.4. The origin of this form is not clear, but 
the forms bear a striking resemblance to the Bodo-Garo negative existential copula gwi. Both 
may be related to Jinghpaw koi ‘avoid, shun’ (Hanson 1906:242) or ké ‘scarce’ (Hanson 
1906:232). 

Mindat (Southern Chin) has a preverbal negator which looks as though it could be related 
to this form (Jordan 1969: 54-55): 


(20) kal kah law khai 
NEG 1SG come FUTURE
‘I will not come.’ 


(21) käh ei pha ci 
NEG eat PERF FIN 
‘[He] has not eaten.’ 


4.3. *no 


One more negative formative which occurs in several languages is no. We have already seen 
this in the unrealized paradigm in Koireng (Section 3.1). Another example, also from NW KC,
is Chhothe (Jayalata Devi 1992), where negative -no is suffixed to the verb, preceding other 
suffixes: 


(22) neg dan-ne
you know-TNS
‘You know it.’


(23) nay dan-no-e
you know-NEG-TNS
‘You don’t know it.’


(24) kay bu bak-in
I rice eat-1SG
‘I eat rice.’


(25) kay bu bak-no-y-e
I rice eat-NEG-1SG-INTERR
‘I do not eat rice.’


In example (25) we see that negative -no- takes the 1st person agreement suffix, providing
evidence for its verbal origin. In Hmar it occurs as an uninflected particle with a verb, but
conjugated with the characteristic KC prefixes as a negative copula (Baruah and Bapui 1996:
133-135):


(26) ka-fe: a nih
1SG-go ASP FUTURE
‘I am going.’


(27) ká-fe: no: ni-y
1SG-go NEG FUTURE-1SG
‘I am not going.’


(28) zirti:rtù ká-nih
teacher 1-COP
‘I am a teacher.’


(29) zirti:rtu kan-noh
teacher 1-NEG
‘I am not a teacher.’


4.4. Languages with two postverbal negatives 


Many Kuki-Chin languages use more than one of these negative constructions. We have 
already seen the occurrence of both #mak and *no in Koireng and Moyon (Section 3.1). In
Paite (Northern KC) we see both *law and #kay (C. Y. Singh 1992, N. S. Singh 2006):


The declarative negation is also formed by adding the suffix ‘key’ to (1) a verb, (2) a be-verb, 
and (3) a copula verb. But in the case of sentences containing the verb ‘hi’ [the copula], the 
declarative negation may also be formed by adding ‘law’ to the main verb also. (N. S. Singh
2006: 152)


That is, a finite verb is negated by sentence-final key (N. S. Singh 2006: 152): 


(30) amaʔ a-kap
he 3-weep
‘He weeps.’


(31) amaʔ a-kap-key
he 3-weep-NEG
‘He doesn’t weep.’


The copula hi used as an auxiliary may be negated the same way:


(32) amaʔ kóp a-hi
he weep 3-COP
‘He weeps.’


(33) ómáʔ kóp a-hi-key
he weep 3-COP-NEG
‘He doesn’t weep.’

But alternatively the lexical verb may be negated instead, in which case the negative form
is law:


(34) máʔ kdp-law ə-hí
he weep-NEG 3-COP
‘He doesn’t weep.’


5. Conclusion 


Postverbal negative constructions in Kuki-Chin have arisen through two broad pathways: 
grammaticalization of an auxiliary negated with the PTB *ma- prefix, and grammaticalization 
of some other serialized verb, either an inherently negative copula or another verb with 
inherently negative meaning. In a brief survey we have seen each of these pathways in 
different languages involving different specific source constructions. First, we can conclude 
from the multiplicity of negative constructions within a single branch that this represents a
fairly recent process; if the development of a new postverbal negative construction had been 
completed by Proto-Kuki-Chin we would expect the daughter languages to share the same 
construction. Second, the fact that all the KC languages have innovated postverbal negation, 
and through several different paths, suggests a consistent tendency toward postverbal negation 
in these languages. We could imagine a typological tendency, as post-head operators are 
considered to be characteristic of SOV languages, or an areal phenomenon, given that there is 
less evidence for such a tendency in other TB languages. 
Abstract

Apatani is a Tibeto-Burman language spoken in northeastern India. Apatani was classified as Western
Tani by Sun (1993), and this phylogenetic assignment was recognized by van Driem (2001), Burling
(2003) and Matisoff (2003). In this paper, the systematic analysis of Sun’s 25 lexical isoglosses used to classify the
Tani languages is extended to a list of almost 300 roots, which are given in the
appendix. The source of data is the lexicon in the appendix of Post and Tage (2013: 48-75). On the one
hand, one focus is on regular sound changes from Proto-Tani (PT) to Apatani (Apt). Sun set up a
list of sound changes, which are reviewed; additionally, some updates are proposed on the basis of
the above-mentioned new data. A special focus is on the deletion of Proto-Tani rhyme velar nasals
and the subsequent compensatory lengthening of the preceding vowels in Apatani (e.g. PT *raŋ > Apt raa
‘empty’), and on syllable onset cluster simplification (e.g. PT *kri > Apt xi ‘count’). Also, the
so-called “underspecified nasal” and its development from PT rhymes are presented (Post 2013: 26). On the
other hand, an extensive analysis of the almost 300 isoglosses is presented. For instance, unique forms
with no cognates in any other Tani languages (e.g. Apt ‘u-dé ‘house’ (PT *nam and PTB *kyim)) and
some revised PT reconstructions on the basis of Apatani (e.g. PT *puŋ instead of PT *di ‘mountain’ for
Apt puu) are discussed. Also, Proto-Tibeto-Burman (PTB) reconstructions play an important role, as
certain Apatani forms are closer to the PTB reconstructions than to the reconstructions of PT (e.g. PTB
*sya > Apt yo ‘meat’ (PT *dir)). The results are twofold: (i) some Apatani phonological
non-correspondences are pointed out which suggest proto-variation, and (ii) unique forms are found which could indicate
a now extinct phylum or a substrate situation.

1. Introduction


Apatani (henceforth Apt) has been classified as an early-branching Western Tani (henceforth 
WT) language by Sun (1993). This classification and tree were adopted by Burling (2003).
Figure 1 shows the Tani languages family tree posited by Sun (1993: 297). The classification 
of Apatani has been recognized by scholars (van Driem 2001, Matisoff 2003, Post 2006a). 
This paper systematically analyzes nearly 300 Apatani roots in regard to Proto-Tani 
(henceforth PT) reconstructions, which is, to the knowledge of the author, the first of its kind. 
All PT reconstructions are taken from Sun’s work. Since Post and Tage’s (2013) paper, the Apatani data have been understood in a new way, mostly in phonological terms. This allows us to critically review the sound changes posited by Sun (1993), which were based on older data (Simon 1972, Abraham 1985 and 1987, and Weidert 1987). Still, a more extensive lexical-comparative study has not yet been conducted. This article has two main aims. On the one hand, in §2 we systematically review Sun's classification criteria with regard to Apatani and subsequently, in §3, revise some of the sound changes proposed by Sun (1993). This is possible due to both the availability of new data (and therefore a new understanding of the data) and the extensive analyses recently conducted on Apatani phonology and lexicon (Post and Tage 2013). On the other hand, in §4 the inclusion of 300-odd lexemes gives additional information on the status of Apatani within the Tani languages. As a result, the proposed sound change PT *z- > Apt j-¹ is discarded in favor of PT *z- > Apt h-. Then, some other PT > Apt sound changes are given with minor revisions to the sound changes postulated by Sun (1993). These revisions include PT *kr- > Apt x- and PT *-uŋ/*-aŋ > Apt -uu/-aa². On the lexical side, non-corresponding forms and unique Apatani forms are pointed out. Non-corresponding forms are interesting insofar as they directly raise the question of their origins, and so, to a promising extent, do unique forms.

It is not the first time that the phylogenetic position of a Tani language within the family has been critically reconsidered. Milang, another Tani language, has received scholarly attention (most notably from Post and Modi 2011), and it was proposed to shift Milang to a Pre-Proto-Tani level due to the characteristics of its morphophonology. A similar conclusion can perhaps be drawn for Apatani, since a simple descent from PT does not account for both the unique forms and the phonological non-correspondences. Nor is this the first time that Apatani has been described as relatively special in the Tani context. Sun (1993: 295) described Apatani as relatively “aberrant”; Post and Tage (2013: 18) listed a number of mainly grammatical features that are difficult to explain. Apatani was therefore classified as an early-branching WT language (cf. Figure 1). In that tree, Minyong is not given and Gallong refers to the Galo varieties. Thus, it is plausible either to suggest variation at the proto-language level or, similarly to Milang, to shift Apatani to a Pre-Proto-Tani level. Last, Post and Tage (2013: 18) also suggest that Apatani “may incorporate features of a substrate of unknown phylogenetic status”. While we agree with this suggestion, we do not claim that any of the explanations just mentioned accounts for Apatani in general. Neither do we propose a significant shift of Apatani within the Tani Stammbaum. As is often the case in historical linguistics, these explanations are rather vague, but they should still be considered as tentative and preliminary attempts on the basis of the available data sets. Rather, the intention here is (a) to point out the importance of the revision proposals given below and (b) to show the non-correspondences which attracted attention during the study and which need to be addressed, at least in a descriptive manner.


¹ Note that in Tani studies y is nowadays used for the palatal semivowel, whereas j occurs instead in Sun (1993).


[Tree diagram: Proto-Tani branches into Western Tani and Eastern Tani. Languages shown: Milang (?), Apatani, Damu (?), Bori, Bokar, Mising, Padam, Nyisu, Bengni, Nishing, Tagin, Yano, Hill Miri, Gallong (?).]


Figure 1 - Provisional Tani family tree (Sun 1993) 


² Note that long vowels are represented by double vowels as above instead of the IPA symbol for long vowels.


1.1. Tani
1.1.1. Apatani 


This section briefly gives a typological contextualization of Apatani, which is spoken by 28,400 speakers (2001 census, Government of India, in Lewis et al. 2014). There are seven vowels in Apatani and there is a distinction between high and low tone (compare ku- ‘maternal uncle’ vs. ku- ‘dove, pigeon’). Furthermore, Apatani words (that is, phonological words that can be uttered by speakers) usually consist of two bound morphemes. For instance, the word for ‘bone’ is 'a-lóo, where the first component, a prefix, is found on a large number of nouns and adjectives in combination with basic entities, such as kinship terms, body parts etc., in Tani languages (Post 2006a: 49). However, the second morpheme cannot be uttered as an independent word by Apatani speakers. For conventional reasons, the high central
vowels are represented as i in Apatani (Post and Tage 2013). Note that w has been used as i as 
well in the past by authors (e.g. in Sun (1993)). The next section reviews Sun's classification 
criteria. 


2. Review of Sun's classification criteria 


Sun (1993) developed his Tani subgrouping scheme on the basis of four phonological features (Sun 1993: 231) and 25 lexical isoglosses (Sun 1993: 234-258, cf. Table 1). The sound changes apply to the development from Proto-Tani to the Tani languages. The four phonological innovations, which are defined in the respective subsections, are velar palatalization (§2.1), labial palatalization (§2.2), liquid cluster simplification (§2.3) and final stop erosion (§2.4). The names of these processes are adopted directly from Sun's work. The following languages are compared in this paper: Apatani (WT), Lare Galo (WT), Upper Minyong (ET) and Milang (ET). The source of data for Galo (also Adi and Gallong), Upper Minyong and Milang is Post and Modi (2011: 215-258). The four innovations are now compared.


2.1. Velar palatalization 
Velar consonants are palatalized before front vowels (e and i) in most WT languages but generally not in ET and Milang. This feature is given for both PT and the four Tani languages in Table 1.


Table 1 — Velar palatalization in four Tani languages


Language | Form | Gloss
PT | *ki | ‘ill, in pain’
Apatani | 'a-ci | ‘ill, in pain’
Lare Galo | ci- | ‘ill, in pain’
Upper Minyong | ki | ‘ill, in pain’
Milang | a-ki | ‘ill, in pain’


Table 1 exemplifies that *k is palatalized in Apatani and Lare Galo, but generally not in Eastern Tani. There are languages in the Eastern Tani branch where the palatalization innovation arises, such as Damu³ la?-ci ‘left’ (cf. Padam lak-ke), but in Western Tani the innovation arises without exception. The data set is limited to five roots. Besides ‘ill, in pain’ (*ki-), there are the following PT roots: ‘crab’ (*ke), ‘left hand’ (*ke), ‘marrow’ (*kin) and ‘boil water’ (*kil). Their Apatani reflexes have a palatalized onset, and in the case of ‘marrow’ a diachronic process becomes visible: Apatani cuy is the result of the palatalization mentioned above plus a change in vowel quality, from *ciy to cuy ‘marrow’. The palatalization is always visible in Lare Galo (ci ‘crab’, ci ‘left hand’ and cir ‘boil water’), and, where data exist, there is no palatalization in Upper Minyong (ke ‘crab’ and kin ‘marrow’). In Milang no palatalization occurs in ke ‘left hand’.


³ Source: Sun, Jackson Tianshin. 1993. Tani synonym sets. (unpublished ms. contributed to STEDT). Accessed via STEDT database <http://stedt.berkeley.edu/search/> on 2015-10-12.
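
The onset rule described in this subsection can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: forms are plain strings without tone marks or prefixes, the function name `palatalize_velar` is hypothetical, and only the change of k to c before e or i is modeled. The vowel changes seen in some reflexes (e.g. Apatani ci ‘crab’ from PT *ke) are deliberately left out.

```python
# Illustrative sketch only: models the onset change k > c before the
# front vowels e and i, as described for most Western Tani languages.
# Vowel changes and prefixes are not modeled.
FRONT_VOWELS = {"e", "i"}


def palatalize_velar(form: str) -> str:
    """Return the form with an initial k replaced by c before e or i."""
    if len(form) >= 2 and form[0] == "k" and form[1] in FRONT_VOWELS:
        return "c" + form[1:]
    return form


# PT roots cited in the text, written without the asterisk; "ka" is a
# made-up control form with a non-front vowel.
for root in ["ki", "ke", "kil", "ka"]:
    print(root, "->", palatalize_velar(root))
```

Applied to Eastern Tani data, the function would simply not be used, since the text reports that the innovation is generally absent there.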


2.2. Labial palatalization 


Labial consonants are palatalized in most WT languages except Apatani, but generally not in 
ET and Milang. This is exemplified in Table 2. 


Table 2 — Labial palatalization in four Tani languages 


Language | Form | Gloss
PT | *a-mik | ‘PFX-eye’
Apatani | -mí? | ‘eye’
Lare Galo | ɲik | ‘eye’
Upper Minyong | -mik | ‘eye’
Milang | -mik | ‘eye’


Only in Galo is there palatalization of the labial consonant; Apatani retains PT *m. Here, Apatani clearly has a different status from the other WT languages. Examples where this is visible include Apt mi-lo ‘husband’ < PT *mi-lo (Galo ɲi-loo), Apt mi-yo ‘rich’ < PT *mi-ta/*mi-ran (Galo ɲi-tə), Apt mi ‘sister (elder)’ < PT *me (Galo oni) and Apt mi ‘tail’ < *me (Galo jie).


2.3. Liquid cluster simplification 


Liquid cluster simplification, that is the development of PT *ry- in the Tani languages, is not a straightforward feature. Sun observed a simplification from PT *ry- to j-⁴ in Bokar, Mising and Padam, which he subsequently used as a subgrouping criterion (Sun 1993: 237). On the one hand, there is a de-rhoticization process in Apatani (*ry- > ly-) and on the other hand there is a simplification process in ET languages (*ry- > y- in Upper Minyong and Milang). Galo exhibits its own sound change: PT *ry- > Galo r-. This is shown in Table 3. The liquid cluster simplification in Tani languages is discussed in more detail in Post and Modi (2011: 15). Figure 2 exemplifies the respective outcomes of the sound changes according to Post and Modi (2011: 15)⁵. Note that the WT languages Bokar (BKR) and Pugo Galo (PGG) also exhibit the ET simplification due to areal diffusion (Post and Modi 2011: 230). It is noteworthy that Apatani is the only Tani language whose reflex of the proto onset is ly-.


⁴ In Bokar and Bengni /j/ is a palatal semivowel.

⁵ Post and Modi (2011) use the following abbreviations: APT for Apatani, BKR for Bokar, LRG for Lare Galo, MLG for Milang, NWG for North-Western Galo, PDM for Padam, PG for Proto-Galo, PGG for Pugo Galo, UBM for Upper Belt Minyong, UNY for Upper Belt Nyishi. Also, Post and Modi use j, whereas we use y for the palatal semivowel; however, in the Appendix we use the original forms with j.




14. The genetic position of Apatani within Tibeto-Burman 


Table 3 - Cluster simplification in four Tani languages 


Language | Form | Gloss
PT | [form illegible] | ‘pig’
Apatani | 'a-lyi? | ‘pig’
Lare Galo | a-rak | ‘pig’
Upper Minyong | e-yek | ‘pig’
Milang | a-yek | ‘pig’


[Tree diagram: PT *rj- splits into PWT *rj- and PET *j-. Reflexes shown: APT lj-, UNY rj-, PG *rj-, BKR j-, NWG r-, LRG r-, MLG j-, UBM j-, Msc j-, PDM j-.]


Figure 2 — Liquid cluster simplification according to Post and Modi (2011: 15) 


2.4. Final coronal stop erosion 


A very important sound change is the final coronal erosion, a WT feature. This sound 
change has been documented in Sun (1993: 189). There are less straightforward cases of this 
sound change; in a specific set of cases, the final coronal stop is not fully eroded (as shown in 
Table 4), but rather changes to a glottal stop, i.e. PT -koC > Apt -ko? ‘open’. 


Table 4 - Final coronal stop erosion 


Language | Form | Gloss
PT | *brat- | ‘vomit’
Apatani | ba- | ‘vomit’
Lare Galo | ba- | ‘vomit’
Upper Minyong | bat | ‘vomit’
Milang | byot- | ‘vomit’


2.5. 25 lexical isoglosses 


In addition to those four sound changes, Sun listed 25 lexical isoglosses (1993: 278) in order to classify the Tani languages. The analysis of those 25 roots shows that the vast majority of the forms align, as expected, with WT. The list is given in Table 5 with Proto Western Tani (henceforth PWT) and Proto Eastern Tani (henceforth PET) reconstructions, the Apatani form and the proposed alignment. All orthographic conventions have been adapted.


Table 5 — Sun's 25 Tani lexical items


Gloss | PWT (Sun 1993) | PET (Sun 1993) | Apatani (Post and Tage 2013) | Alignment
‘urine’ | *sum | *si | si? | ET
‘blind’ | *mik-cin | *mik-may | mip-ca⁶ | WT
‘mouth’ | *oam | *nap-pay | gon, gun | WT
‘nose’ | Zoch num | *nV-buy | pin | WT
‘wind’ (n.) | *ryi | *sar | lyi | WT
‘rain’ (n.) | um H don | *pV-dog | mi-doo | WT
‘thunder’ | *don-gum | *don-mir | ge | WT
‘lightning’ | *doy-ryak | *ja-ri | doo-lya? | WT
‘fish’ | *yo-i | *a-no | yi-i | WT
‘tiger’ | *nay-ta | *myo/*mro | paa-ti | WT
‘root’ | *m(y)a | *pir | maa | WT
‘old man’ | *mi-kam | *mi-fin | 'aba/'axaa | ?
‘village’ | *nam-pom | *duy-luy | le-ba⁷ | WT
‘granary’ | *nam-suy | *kyum-suy | nee-suu | WT
‘year’ | *nin | *tak | pan | WT
‘sell’ | *pruk | *ko | pyu? | WT
‘breath’ | *sak | *na | sa? | WT
‘ferry/cross’ | *rap | *koy | boo | ?
‘arrive’ | Wis | *pin | aa- | WT
‘say/speak’ | *ban/*man | *lu | lu | ET
‘rich’ | *mi-ta/*mi-ta | *mi-rem | mii-yo | WT
‘soft’ | *ni-myak | *ro-myak | bi-lye? | WT
‘drunk’ | *kyum | various forms | ta-ņi | ?
‘back’ (adv.) | *-kur | *lat | kuu | WT
‘ten’ | *Cam | *ryin | lyañ | ET


There are, however, three forms that align better with the proposed ET reconstruction and differ significantly from the proposed WT reconstruction. The Apatani reflexes for ‘urine’ si? (PWT *sum, PET *si), ‘say’ lu- (PWT *ban ~ *man, PET *lu) and ‘ten’ lyañ (PWT *cam, PET *ryin) are clearly closer to the respective PET reconstructions than to the PWT reconstructions. Three words, ‘old man’, ‘ferry/cross’ and ‘drunk’, do not seem to align with either PWT or PET. These same three Apatani forms seem to have cognates in Milang without really matching either the PWT or the PET reconstructions: ‘old man’ 'aba (Milang a-be, PWT *mi-kam, PET *mi-zin), ‘ferry, cross’ -boo (Milang -bok, -bog, PWT *rap, PET *koy) and ‘drunk’ ta-yi (Milang cay-har, PWT *kyum and various PET reconstructions). There is, however, no attested Milang-Apatani relationship and therefore this argument should be taken with caution.

To summarize, this list of 25 isoglosses shows that Apatani aligns mostly, but not exclusively, with WT; however, more evidence can be gathered by adding more data. Next, §3 examines the sound changes which need to be revised.
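
As a quick check on the summary above, the Alignment column of Table 5 can be tallied with a short Python sketch. The dictionary simply transcribes the alignment labels from the table; the variable names are illustrative and not part of the original study.

```python
from collections import Counter

# Alignment labels transcribed from the Alignment column of Table 5.
# "?" marks forms that align with neither PWT nor PET.
ALIGNMENT = {
    "urine": "ET", "blind": "WT", "mouth": "WT", "nose": "WT",
    "wind": "WT", "rain": "WT", "thunder": "WT", "lightning": "WT",
    "fish": "WT", "tiger": "WT", "root": "WT", "old man": "?",
    "village": "WT", "granary": "WT", "year": "WT", "sell": "WT",
    "breath": "WT", "ferry/cross": "?", "arrive": "WT",
    "say/speak": "ET", "rich": "WT", "soft": "WT", "drunk": "?",
    "back": "WT", "ten": "ET",
}

counts = Counter(ALIGNMENT.values())
total = len(ALIGNMENT)

for label in ("WT", "ET", "?"):
    print(f"{label}: {counts[label]} of {total}")
```

The tally gives 19 WT, 3 ET and 3 unaligned items out of 25, which matches the three ET-like forms and the three unaligned forms discussed in the text.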


⁶ This form is only found in Sun (1993: 256); Post (2013) did not list ‘blind’.
⁷ This form is only found in Sun (1993: 266); Post (2013) did not list ‘village’.


3. Revision of Sun's sound changes
3.1. Syllable onset sound changes 


Two sound changes involving syllable onsets need to be revised: on the one hand, the simplification of syllable onset clusters, and on the other hand the glottalization of onset alveolar fricatives. Sun (1993) suggested a palatalization of onset alveolar fricatives to palatal semivowels (PT *z- > Apt y-); on the basis of the new understanding of the data, however, this needs to be revised to PT *z- > Apt h-. Three examples are given in Table 6. The revision is possible because Post and Tage (2013) noted that there is an underlying intervocalic glottal deletion rule in Apatani. Therefore PT *ze ‘fruit’ > Apt -hi ‘fruit’, but Apt a-i ‘PFX-fruit’ and not *a-hi, whereas Sun (1993) assumed Apatani ji ‘fruit’ and therefore proposed PT *z- > Apt j-. The underlying h was not visible in the data available to Sun. There are, however, clear instances of this intervocalic glottal deletion, such as Apt lan-hin ‘hundred-three’ ‘three hundred’, whose surface realization lacks the h. The revised sound change accordingly applies to two further lexemes in the data set, both from PT *zin: ‘liver’ and ‘nail’.
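
The intervocalic glottal deletion rule can be illustrated with a minimal Python sketch. It rests on several assumptions that go beyond the text: forms are plain strings, the vowel set is a simplified five-vowel stand-in for the seven Apatani vowels, and the helper name `attach_prefix` is hypothetical. It only reproduces the contrast between the bound root -hi and the prefixed word a-i cited above.

```python
# Illustrative sketch only: a root-initial h is dropped when it ends up
# between two vowels after prefixation. The vowel inventory is a
# simplified placeholder, not the full Apatani system.
VOWELS = set("aeiou")


def attach_prefix(prefix: str, root: str) -> str:
    """Join prefix and root, deleting a root-initial h between vowels."""
    intervocalic_h = (
        bool(prefix)
        and len(root) > 1
        and root[0] == "h"
        and prefix[-1] in VOWELS
        and root[1] in VOWELS
    )
    if intervocalic_h:
        root = root[1:]
    return f"{prefix}-{root}"


print(attach_prefix("a", "hi"))  # prefix + 'fruit'
print(attach_prefix("a", "ci"))  # no h, so nothing is deleted
```

Under these assumptions the first call yields a-i rather than a-hi, which is the pattern that, according to the text, hid the underlying h in the older data.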

On the other hand, PT onset clusters like *kr- are simplified in Apatani, though not to Apatani xrj-, as Sun (1993) suggested, but rather to Apatani x-, as exemplified in Table 7. Post and Tage (2013: 30) argue that a complex cluster Crj- is not found in the Apatani data, even though Simon (1972) recorded one (e.g. akhrya ‘old (person)’). In Post and Tage’s data this lexeme is found as 'àxáa ‘old (person)’. There is thus a major discrepancy in apparent pronunciation, which Post and Tage (2013: 30) state in a footnote that they are unable to explain. The discrepancy might be due to dialectal variation, but this is not plausible, since both Weidert's (1987) speaker and the one in Post and Tage are from the Tajang variety. Alternatively, it may reflect the unpredictable sound changes which occur mostly in compounding, as shown in Post (2006b: 4); Sun (1993: 53) had already noticed “unpredictable phonological alternations”. The status of this cluster is highly doubtful, as it occurs in only a handful of lexemes. There is one exception to the posited rule, namely PT *krat > Apt gi? ‘lie down’ instead of the expected form xi?. One explanation could be that this lexeme is a loanword which entered the language after the sound change took place, although it has not been possible to determine from which language it would have been borrowed. The only similar reflex is found in Proto-Galo, where *get ‘lie down’ is reconstructed (Post 2006a: 109). It is noteworthy that the rhyme sound changes (see §3.2) additionally lead to an identical Apatani reflex xi- for three roots: PT *kret- ‘porcupine’, PT *kra- ‘six’ and PT *kri- ‘count’. It is not clear why Sun (1993) arrived at three different forms. In the case of ‘porcupine’, Sun (1993: 135) gives two forms: xi and xrji. In the case of ‘six’, the form xrji goes back to Weidert (1987) and the form xi was proposed by Abraham (1985) (Sun 1993: 131). For ‘count’, Sun (1993: 134) lists two forms: xrje, which is Simon's (1972) transcription, and xe, which goes back to Weidert (1987).


Table 6 — PT onset glottalization in Apatani


PT (Sun 1993) | Apatani (Sun 1993) | Apatani (Post and Tage 2013) | Gloss | Revision
*ze | ji | -hi | fruit | *z- > h-
*zin- | j- | hin- | liver, nail | *z- > h-


Table 7 — PT onset cluster simplification in Apatani


PT (Sun 1993) | Apatani (Sun 1993) | Apatani (Post and Tage 2013) | Gloss | Revision
*krat- | xrje?- | xer- | kidney | *kr- > x-
*krot- | grji- | gir- | lie down | *kr- > x-
*kret- | xrji-, xi | xi- | porcupine | *kr- > x-
*kra- | xrji-, xi | xi- | six | *kr- > x-
*kri- | xrje-, xe | xi- | count | *kr- > x-
*krok- | - | xor- | crow (v.) | *kr- > x-


3.2. Syllable rhyme changes 


On the basis of phonological analyses of nasal rhymes in Apatani, potential revisions of the sound changes postulated in Sun (1993: 331f.) are proposed. First, Post and Tage (2013) have indicated that there is a so-called underspecified nasal, usually represented as ñ, which occurs in syllable rhyme position, or more precisely as a coda (note that in Tani languages the syllable coda is optional), in the form of nasalization over the nuclear vowel. Sun (1993) used a different transcription, reflecting a different underlying analysis in which the underspecified phonemes were not recognized. The Apatani underspecified nasal results from four kinds of PT rhymes (*-um, *-an, *-iy and *-iņ), which developed into Apatani -iñ, -eñ, -añ and -añ, respectively. One example word for each, with its PT reconstruction and Apatani reflex and including the revised sound change, is given in Table 8. Second, PT velar nasals in rhyme position are deleted when preceded by /u/, /a/ or /o/, which results in compensatory lengthening of the preceding vowel in Apatani (i.e. PT *-uŋ > Apt -uu, PT *-aŋ > Apt -aa and PT *-oŋ > Apt -oo). As a consequence, Apatani has a phonemic contrast between long and short vowels. In open syllables, the vowel is long, whereas in closed syllables with a final stop, the vowel is short. This was noticed neither by Simon (1972) nor by Abraham (1985). A few examples are given in Table 9. Third, the PT *-a rhyme is not, as suggested by Sun (1993), rounded to Apatani -u after labial initials. Rather, the data suggest that the following sound change applies in any environment: PT *-a > Apatani -i. This is exemplified in Table 10. The “tendency towards coda attrition is epitomized in Apatani, where only two PT codas remain: -? (from the original stop codas) and -r” (Post and Sun in press). This tendency was not recognized by Sun (1993); therefore we arrive at -ñ codas.
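
The second rhyme change, velar nasal deletion with compensatory lengthening, is regular enough to be stated as a small function. The following Python sketch is an illustration under stated assumptions: forms are plain strings, ŋ is written as a single character, long vowels are written as double vowels as elsewhere in this paper, and the function name `delete_velar_nasal` is hypothetical.

```python
# Illustrative sketch only: a final velar nasal is deleted after u, a
# or o, and the preceding vowel is doubled (compensatory lengthening).
LENGTHENING_VOWELS = {"u", "a", "o"}


def delete_velar_nasal(form: str) -> str:
    """Apply -uŋ > -uu, -aŋ > -aa, -oŋ > -oo; leave other forms as is."""
    if (
        len(form) >= 2
        and form.endswith("ŋ")
        and form[-2] in LENGTHENING_VOWELS
    ):
        vowel = form[-2]
        return form[:-1] + vowel
    return form


# PT roots from Table 9 and the abstract, written without the asterisk.
for root in ["ruŋ", "taŋ", "loŋ", "raŋ"]:
    print(root, "->", delete_velar_nasal(root))
```

A form ending in a velar nasal after a different vowel, or in a stop, passes through unchanged, which mirrors the restriction to /u/, /a/ and /o/ stated in the text.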


Table 8 — Revised rhyme sound change


PT (Sun 1993) | Apatani (Sun 1993) | Apatani (Post 2013) | Gloss | Revision
*rum | ri | riñ- | spider | *-um > -iñ
*man | mé | méñ- | kill | *-an > -eñ
*lin | là | lañ- | neck | *-in > -añ
*lin | lay | láñ- | hundred | *-in > -añ


Table 9 - PT *-ŋ rhyme in Apatani


PT (Sun 1993) | Apatani (Sun 1993) | Apatani (Post 2013) | Gloss | Revision
*ruŋ | ru | ruu- | ear | *-uŋ > -uu
*duŋ | du | duu- | sit | *-uŋ > -uu
*puŋ | pu | puu- | white | *-uŋ > -uu
*taŋ | ta | taa- | bird | *-aŋ > -aa
*kaŋ | ka | kaa- | look | *-aŋ > -aa
*laŋ | la | laa- | take | *-aŋ > -aa
*loŋ | lo | loo- | bone | *-oŋ > -oo
*soŋ | so | soo- | play | *-oŋ > -oo


Table 10 - PT *-a rhyme in Apatani


PT (Sun 1993) | Apatani (Sun 1993) | Apatani (Post 2013) | Gloss | Revision
*mok | mi | -mi? | cloud | *-a > -i
*ma | mu | -mi | fire, woman | *-a > -i
*la | li | -li | leg, foot | *-a > -i
*na | ni | -ni | leaf | *-a > -i
*kra | xrj | -xi | six, squirrel | *-a > -i
*pok | pi | -pi? | sweep | *-a > -i


We have proposed the revision of two onset and three rhyme sound changes. Next, in §4, we analyze the results for the 300-odd isoglosses in the data set.


4. Results 


In total, 294 roots have been analyzed. They can be found in the appendix of this paper. The CALMSEA (Culturally Appropriate Lexicostatistical Model for SouthEast Asia) 200-word list proposed by Matisoff (2000), an extended Swadesh list, was used as a basis and extended further. Of the 294 roots, 199 Apatani forms (67.7%) followed regular sound changes. In the following, some interesting results are discussed: unique Apatani forms in §4.1, revision proposals for some PT reconstructions in §4.2 and “mysterious” forms in §4.3.


4.1. Unique Apatani forms 


Probably the most interesting discovery of this study is the occurrence of unique forms in Apatani, that is, forms for which no corresponding forms can be found in any of the data sources currently available for Tani languages. These Apatani forms also do not exhibit the expected reflexes of the proposed PT or PTB reconstructions (the latter based on Matisoff 2003). These unique forms raise the question of their origin. Are they a relic of a now extinct language? Another explanation could be, as suggested by Post and Tage (2013: 3), “a substrate of unknown phylogenetic status”. It is particularly interesting that some of the unique forms reflect core vocabulary, such as ‘house’ or ‘do’. A total of 12 forms were found, which account for ~4% of all forms. It can be assumed that more unique forms would be found if the data set were bigger, but the proportion should remain the same or even slightly decrease, since core vocabulary has already been included here. The full list of unique forms is given in Table 11.
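
The two proportions reported in this section follow directly from the counts given in the text (294 roots, 199 regular forms, 12 unique forms). A short Python sketch makes the arithmetic explicit; the helper name `share` is illustrative only.

```python
# Counts as reported in the text of Section 4.
TOTAL_ROOTS = 294
REGULAR_FORMS = 199
UNIQUE_FORMS = 12


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print("regular:", share(REGULAR_FORMS, TOTAL_ROOTS), "%")
print("unique:", share(UNIQUE_FORMS, TOTAL_ROOTS), "%")
```

The first value reproduces the 67.7% given for regular forms, and the second comes out at about 4.1%, consistent with the "~4%" stated for unique forms.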


Table 11 - Unique Apatani forms


Apatani | PT | PTB | Gloss
puulye | *ge | *buw, *gwan | clothes
la- | *han | *oray | cold (water)
mi- | *ryi | *day | do
sar- | *put | - | foam
pote? | *brin | *brin | full
‘ude | *nam | *kyim | house
-lya | *oray | - | lean against
sar-se | *yak | - | millet (foxtail)
-jii | *fio-pran | - | orphan
hu?- | *dan | tur | shake
-ko | *tan | *twiy | short
-be | *pam | *kyam, *wal | snow


4.2. Revision of some PT reconstructions 


Sun’s PT reconstructions were mostly based on Bokar, Padam and Mising, as has been pointed out before (Post and Modi 2011: 14). For instance, PT *hut ‘awake’ is a reconstruction clearly based on ET languages, since Galo exhibits uu ‘wake’, and Apatani huu ‘awake’ likewise suggests that the PT reconstruction should include a velar nasal rather than a stop in coda position, e.g. PT **huŋ⁸. If we take Apatani data into account, the proposed PT reconstructions would in some cases be slightly different. PT proto-variation (indicated by brackets) due to Apatani phonological non-correspondences is therefore proposed in Table 12. More revision proposals for the PT reconstructions are given in the Appendix. The problem of complex phonological processes when several languages are involved, and the resulting proposal of proto-variation, was noted by Sun (1993: 44). Additionally, we find some odd Apatani forms, odd in the sense that they do not match the sound change that would be expected. For instance, ‘open’ is ko?- in Apatani. Accordingly, we would expect the reconstruction to be PT *koC, but the reconstruction is *koŋ, which would imply the following Apatani reflex if it followed the regular sound change: *koo⁹. There thus seems to be an interaction between nasal and stop rhymes in PT. A diachronic explanation could clarify these unexpected forms. Post and Modi (2011) proposed a reclassification of Milang (ET) to a pre-PT stage. A similar approach could account for Apatani. For instance, an interaction between nasal and stop rhymes in a phase before PT would mean that Apatani, Proto-WT and Proto-ET forms and reconstructions would follow regular sound changes. However, a reclassification of Apatani to a Pre-PT stage is potentially difficult to justify, since a vast majority of forms still follow regular sound changes from PT. A proposed diachronic development for Apatani ku? ‘cucumber’ is depicted in Figure 3.


⁸ Here, ** stands for proposed revisions of reconstructions.
⁹ In this case, * stands for an unattested, ungrammatical form.


Table 12 — Proposed revision for some PT reconstructions due to proto-variation


Apatani | PT (Sun 1993) | PT (revised) | Gloss
-xi | *kret | **kre(t) | porcupine
-ci? | *ko, *ka | **ko(t), **ka(t) | bitter
-pyoo | *pro | **pro(ŋ) | palm (of the hands)


Pre-Proto-Tani *kuC
    Apatani ku?
    PT *kuŋ
        PWT *kuu
            Galo kuu
        PET *kuŋ
            Adi kuŋ


Figure 3 - Proposed diachronic development of Apatani ku? ‘cucumber’ 


4.3. Mysterious forms 


There are a number of forms that require special attention. In some prominent cases the Apatani forms do not match the proposed PT reconstructions but seem to be reflexes of proposed PTB reconstructions; there is thus the possibility of borrowing from neighboring non-Tani TB languages, such as Digaroan, Northern Naga, Nungish or Western Arunachal. Three examples are given in Table 13. Another possible explanation is a high-branch split of Apatani, that is, a split prior to the PT stage in relative chronology.


Table 13 — Apatani, PT and PTB forms 


Apatani | PT PTB | Gloss 
daa-can | *ryok | *syam | iron 
yo(o) *din | *sya | meat 
heñ- | *min | *səm | think


There are other forms, though, that are potentially cognate with forms in other Tani languages, mainly Bokar, Nishing and Bengni, but do not follow regular sound correspondences from PT. For instance, Apt hi? ‘feel (transitive verb)’ (PT *han) is cognate with Bkr¹⁰ hup, Apt nii ‘leaf’ (PT *na) is cognate with Bengni¹¹ nii and Apt bit ‘bamboo’ (PT *haa) is cognate with Nishing¹² bum ‘classifier for long objects’.


¹⁰ Source: Sun (1993: 199).
¹¹ Source: Sun (1993: 163).
¹² Source: Das Gupta, Kamalesh. 1969. Dafla language guide. Shillong: Research Department, North-East Frontier Agency. Accessed via STEDT database <http://stedt.berkeley.edu/search/> on 2015-10-10.


5. Conclusion


Apatani has been classified as a WT language by Sun (1993) on the basis of four phonological innovations and 25 lexemes. The aim of this study was to extend the list of words in order to gain better insight into the classification of Apatani, and to examine to what extent the sound correspondences are accurate in light of the new Apatani data available to us. On the one hand, we have established regular sound changes for PT > Apatani and, on the basis of new data, were able to propose some minor revisions to the sound changes proposed by Sun (1993). These included revisions for both syllable onsets and syllable rhymes. The former concern an onset sound change (PT *z- > Apt h-) and liquid cluster simplification (PT *kr- > Apt x-); the latter include underspecified nasals (PT *-um > Apt -iñ etc.), vowel lengthening in open syllables due to the loss of velar nasals (PT *-aŋ > Apt -aa etc.) and the PT *-a rhyme becoming -i in Apatani. On the other hand, an extensive analysis of nearly 300 lexemes has revealed several facts bearing on the classification of Apatani within the Tani phylum. It was possible to find unique forms in Apatani and to propose alternate reconstructions due to proto-variation. In addition, some mysterious forms were presented: Apatani reflexes of proposed PTB reconstructions that do not match the proposed PT reconstructions, and Apatani forms that are cognate with forms in other Tani languages but do not seem to follow regular sound correspondences from PT, thus pointing to a possible spreading of terms.


Abbreviations 


BKR: Bokar, ET: Eastern Tani, PFX: prefix, PGG: Pugo Galo, PET: Proto Eastern Tani, PT: 
Proto-Tani, PTB: Proto-Tibeto-Burman, PWT: Proto Western Tani, WT: Western Tani 
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Appendix 
# Gloss Apatani (Sun 1993) | Apatani (Post Proto-Tani Proto-Tani 
and Tage 2013) (revised) 
1 alive tur tur- *tur 
2 angry xe hàa-d& *fak 
3 ant ru ruh/ru? *ruk ~ *rup 
4 arrow pu pu *puk 
5 arrow poison | mrjo myó *mro 
6 ascend ca cda- *Cay 
7 awake v.i. hu huu- *hut **hup 
8 baby na naa *na: 
9 bamboo bije bijé RER 
10 [banana kopa, kipa kipá? *ko-pak 
11 beans pe-ru run *pe: 
12 |bearn. si-ti tin *tum 
13 `" belly krji xi(i) *kri 
14 [big kai-ni kae Tra 
15 |bird pi-ta taa *tan 
16 | bite a-si ci/si? *cam 
17 "butter ko-ci? ci? *ko/*ka: **ko(t) 
18 bladder sar-pu Sàr- *sur 
19 |blood a-ji hii *vii 
20 | blow muy mu? *mut 
2] |body wu u W 
22 _| body dirt - -ko? *kot 
23 boil water car car *kil / *kray 
24 bone lo loo *lon 
25 |bow n. li lyi *ryi 
26 | break dar ~tar tar “tir 
27 | breath sar sår- *sak- 
28 | brother elder | wa ban *bin 
29 ‘| brother nu nu *ni 
younger 
30 |buy ri ri *ra: "ra 
31 |call,cry gjo gyo? *erok 
32 |can,ableto |la laa Zon 
33 | carry on gi ba? *bak 
back 
34 | chase mo mon *mon 
35 | cheat, lie mu 7a-mu *mao 
36 (chicken ro ro? *rok 
37  |child, son ho ho *ko./ho 
38 | chin gom gonpi *Cok-pray 
39 |classifier for | ta? byar- *bor 
thin, flat 
objects (e.g. 
pieces of 
cloth) 
40 [clothes pu-lje püulyé *ge 
4] [cloud mi? mi? *mak ~ *muk 
42 |cold water |- la *han 
43 |comb n. k'rji? xí NU 
44 |come,enter |a haa *van 
45  |count k'rje xiD/xi *kri 
46 crab ci ci *ke 
47 |crazy, mad |ru ruñ *ru **rum 
48 | crooked gir ku *oar 
49 cross v.i. po boo *kon 
50 !crow(bird) | wa? a? *ak 
5] |crow v. k"rjo xo? *krok 
52 Cucumber ku ku? *kuy 
53 | cut pa pa *pa 
54 | day lo lo ~ loo *lo 
55 | dead - xi *ka 
(resultative 
verbal 
particle) 
56 (die si si *si 
57 (dig (hole) du dü *ko ~*kyo 
58 |do mu mi *ryi 
59 |dog ki ki — kii *kwi: 
60 | door lier) lye? *ryap 
61 | dove, pigeon | ku ku *ki 
62 drink ta tan *tin 
63 |drip - du *di 
64 |drunk ta tant *krum 
65 |duck je? je? "jap 
66 |eagle, hawk | mu mi ~ mu *mi 
67 | ear ru ruu *run 
68 |earthworm ` (dor dor-gi *tol ~*dol 
69 | eat di di *do 
70 |egg pu pu~ puu *pi 
7l [eight prj) i2?-ni pini *pri-fi 
72 |elbow du du *du 
73 | empty ra ráa *ran 
74 |enemy - má? *rol 
75  tescape,flee | har hági *kat 
76 evening ljin lyifi *ryum 
7] j|excrement |i- i- *e: 
78 | exit v. lin lin *len 
79 | extinguished | mi? mi? *mit 
80 |eye mi mi? *mik 
81 | face ñi pi? "mo 
82 |face, cheek  |mo mo ~moo *-mo. 

83  fall(froma |hu hü *ho 
height) 

84 | fat n. hu-lji huulyi? *fu / *jin 

85 | father a-ba ba *bo 

86 | father-in-law | a-to to *to 

87 feel v.t. he hi? *han 

88 __| finger ci ci? *lak-key 

89 fire mu mi *ma 

90 [fireplace a-go "gi *ram / *rom 

(PT *eu bor) 

9] | fireplace re? *re? *rap 
shelf 

92 ‘| first prjo pyoo *pyorn 
(adverbial 
verbal 
particle) 

93 fish ni ni N) 

94 (five 7o 7o N) 

95  |flat ljá -lyan *ryin (not 

*ryap) 

96 flea ki xi? Ei 

97 | flesh ja? yd? *yak 
(human) 

98  |float bu puu *byay 

99 | flow bi bii *bit 

100 | flower pu puu *pun 

101 | fly n. mi? tà-mi? *yin 

102 |fly v. go góo- *byar 

103 | foam hu-bju sa(r) *put 

104 |foot li Gëf *lo 

105 | four pi-lje pi! ~ pi?! ~ pi! |*pri 

106 | friend ji jin *jon- jen 

107 | frog ti ti? “tik 

108 | fruit ji hi *ze 

109 | full po-te pote? *brin 

110 | gall pir par *pi 

111 |ghost i-gi i < yu *rom 
(ancestral) 

112 |ginger ki kí *kre:? 

113 | give bi bi *bi 

114 | go i in *in 

115 | good a-ja yaa *kan-pro 

116 |good (verbal |-po pyo *_pro 
particle) 

117 | granary SU suu *sun 

118 |grandfather | to to *to 

119 | grandmother | jo yo *yo 

120 | grasshopper | ko ko(fi) *kom 

121 | guest, ni-bo nibó *myi-bo 
outsider 

122 |guts - xí *kri 

123 hair mu mu *mit 

124 |hand, arm la? lá? *lak 

125 |handspan go go? Zeon 

126 | head di din *dum 

127 | heart ha haa *han 

128 | heal - mii *la 

129 |hit (target) |da? da? *byak 

130 | hornbill pe-su péesu? *oray 

131 | horse go-ra goo-rda *ki 

132 !hot, warm grju gù- Zou ~ *eyu 

133 |house u-de "dé *ham 

134 | hundred lay lan *lin 

135 |husband mi-lo milo *mi-lo 

136 ll ci "a-ci ki 

137 |iron da-cá dàa-cáfi *ryok 

138 jump po? lyoo- *pok 

139 kidney he xé? *krat'-pyil 

140 | kill mé méfi- *man 

141 !kiss mo-cu mó(0)-cu? *pup ~*puk 

142 !knee li-bà liban *la-bin 

143 | knife nar-tu ar-tu / nár-tà | *har 

144 | language, a-gü GG *gom 
speech 

145 |laugh nar nar *pil 

146 | leaf ni nti *na 

147 | lean against |- lya *oran 

148 | leech pe pe? *pat 

149 |left(-hand) | ci ci *ke 

150 |leg li li RÄ 

151 lick lja lya? *ryak 

152 | lie down gji giá / gi? *erat / *krot 

153 |lip fia-cu a-cu *bel 

154 | liquor 0 "00 *por 

155 [listen ta ta *tat 

156 | liver hir hin *zin 

157 look ka kaa *kay 

158 | lose v.t. a-lju "Oh *nok 

159 |louse (head | k’rjé xi? *fik 
louse) 

160 | lungs ru ru *ru 

161 | machete, dao | ljo lyo? *ryok 

162 | marrow cun cun *kin WS 

163 | meat jo yo(o) *din 

164 | melt ji ji? NU 

165 |millet(fox- |sar-se sársé *yak 
tail) 

166 | monkey bi bi / bii *be: 

167 | moon lo lo *lo 

168 | morning ro ro- *ro 

169 | mortar pir pa(r) *par 

170 | mosquito ru ruu *run 

171 | mother na ~ni ni *na 

172 | mountain bu puu *di 

173 | mouth gu gon / gun Zoom 

174 | move v.i. bju byu *bri 

175 |mushroom |j? in *yin 

176 |nail hin hifi *zin 

177 |neck là lafi *lin 

178 |negator ma -má *may 

179 | nest si? si? *sup 

180 | night jo yo *yo: 

181 nine ko-a koda *kV-(n)ag 

182 | nose pi pin *ña-pum 

183 |one ko ko/kon/kun | *kon 

184 | open (verbal | ko ko? -ko 
particle) 

185_| orphan ji jii *hio-pran 

186 | otter ri rifi ram 

187 |out(verbal |- - lifi len 
particle) 

188 | palm pjo pyoo pro 

189 | palm (of p'rjo là?pyóo (kê) *lak-pro **nron 
hand) 

190 | pangolin pi pi? *pit 

191 | penis mja mya(?) *mrak 

192 |pig lji lyi? *ryek 

193 |place - moo *mon 

194 |plant v.t. - ni *di: / *din 

195 |play sO sóo- *son 

196 | porcupine xrji xi *kret 

197 |pot (generic) | ca can *pV-kig 

198 |pour ti tii *lik 

199 |powder mi? mi? *mik 

200 |prohibitive |jo yo *yo 
marker 

201 |punch ki ki *kit 
(downward) 
with fist 

202 | put - li? RR 

203 | quiver (for | ge ge? *eat 
arrows) 

204 |rain n. do doo *dor 

205 red can can / lan *lin 

206 | reflexive SU su *-su 
marker 

207 |rice (cooked) | pi pin *pim 

208 |rich yo yo *mi-ta / *mi- 

ram 

209 | right-side bi bi? *brik 

210 |ripe mi mifi *min 

211 river le le *si, *buy 

212 |road len ~ lem lefi *lam 

213 |roast bja byaa *bran 

214 |root ma maa *pir, *mya 

215 |rot ja yaa *yay 

216 | rot/rotten ja yaa *yay 

217 |run har har *zuk / *duk 

218 | salt lo lo Zo 

219 |say,speak  |/u lu *ban~ man 

220 |scoop/ladle |tu tu *suk~ fuk 
Y. 

221 [scratch (with |- be? Toun 
claws) 

222 |seed - mo NI 

223 |seedling - di *Cul 

224 | sell prju() pyu? *pruk 

225 | seven kanu kanu *kV-nit 

226 |sew - rii *hom 

227 |shake - hu? *dan 

228 |sharp red re? / ro? *rat 

229 |shoot v. ed ed Zon 

230 |short di ko *ton- dan 

231 |shoulder gor gor *oor 

232 |sister (elder) | mi mi *me 

233 sit du duu *duy 

234 | six ri xi *kra 

235 [skin n. ljo lyo *ryo 

236 |sleep mi mi *yup 

237 |slip v. - le? *lut 

238 |smallpox bü bun *bum 

239 |smoke n. ku ku *ki 

240 | snail no no *no 

241 |snake bu bu *bi 

242 |snot no? no? *nop 

243 | snow pi be *pam 

244 |soft - lye? *myak 

245 |son-in-law — | ma-bo uu / ma? *mak 

246 |sound du du *dut 

247 |sour xru xu(u) / hi? *krun 

248 |spider ri riñ / rii *rum 

249 |squirrel xrji xi *krə 
(generic) 

250 |stab ni? ni? *nik 

251 |stand da da? *dak 

252 [steal p'rjo pyoo *pyon 

253 |stone là lafi *lin 

254 |sun da da THI 

255 |swallow v.  |ni (Weider 1987) ni? *met 

256 | sweep pi? pi? *pok 

257 |sweet ti? ti? "ps 

258 |swell - pyan *brin 

259 | tail mi mi *me 

260 | take la laa *lan 

261 | takin bi byu? *bren 
(Budorocas 
taxicolor) 

262 | tall, high ho hoo *hot 

263 | ten lia lyan *ryin 

264 |tens (e.g. - xan *cam 
twenty) 

265 |thin(book) |- boo-lyoo *bV-Cor 

266 | think he hen *min 

267 | three hi hin Zum 

268 throw,cast | vir gyu *vor? 

269 thunder gé ge *oum 

270 | tiger pat páati *myo 

271 |tongue ljo lyo *ryo 

272 | tooth hi hii *fi: 

273 torch - ru *ru 

274 |two ne ni / ni? *üi 

275 uncle - áaté *pan 
(paternal) 

276 |urine si? si? Zei / *sum 

277 | vomit ba ba *b(r)at 

278 | vulva, tu tu *t 
vagina 

279 | wait for lja lyaa *ryan 

280 |water si si / si Zeg 

281 | white pu puu *pun 

282 |wife mi myi(i *mi-fVr? 

283 | wild cat so so *so 

284 | wind n. lji lyi *ryi 

285 |wing le le? *lap 

286 | wipe ti? pu? “tit 

287 |wither, dry |se sen *san 

288 | woman ñi mi *myi-ma: 

289 | wood sa san *sin 

290 | worm, insect | dor dar / dor *pum **dol 

291 | wrist nir yar *yar 

292 | write ke kee *fat 

293 | year na nañ *nin 

294 | yeast po? po? *pop 
hypotheses for the phylogenetic affiliation of Puroik are evaluated. 
1. Introduction 


Puroik (previously Sulung) is the official name of a tribe and a group of languages and 
dialects spoken in five districts of Arunachal Pradesh. Previous studies on Puroik all 
described an eastern variety of Puroik (Deuri 1983; Tayeng 1990; Sūn et al. 1991; Li 2004; 
Remsangpuia 2008; Soja 2009). The literature does not mention the western dialects 
discussed here, which are not mutually intelligible with that variety. 

Puroik is generally regarded as aberrant, but in some way relatable to the Tibeto-Burman 
phylum of languages (Sun 1992; Driem 2001; Burling 2003). Sun (1992:80 fn. 18, 1993:11 
fn. 18), based on the sparse data available to him, suggested an affiliation with Bugun, 
Sartang, Sherdukpen, Khispi (Lishpa) and Duhumbi (Chugpa) in West Kameng (a group 
sometimes called the “Kho-Bwa cluster” after Driem 2001). Little has been done to prove Sun's 
hypothesis, and although it indeed looks promising (Lieberherr and Bodt 2015), this question 
will not be addressed further here. 

This paper presents a comparison of three selected Puroik dialects: the westernmost 
variety of the village Bulu (B), a variety from the east spoken in several villages in the 
Chayangtajo area (CT) and the variety of the villages Kojo, Rojo and possibly Jarkam (KR), 
which lie geographically between Bulu and Chayangtajo. Two major questions motivate the 
comparison: 1) Are the so-called Puroik dialects a coherent group? I.e., are the Puroik dialects 
reconstructible to a common ancestor? 2) Are the Puroik dialects Tibeto-Burman languages? 
I.e., does the “core” of the Puroik lexicon contain a critical number of words with cognates 
and regular sound correspondences in Tibeto-Burman languages? This study is preliminary in 
the sense that for the first question only three Puroik varieties are taken into consideration, 
and for the second question, the comparison is limited to Kuki-Chin (using the data of VanBik 


1 I would like to thank all Puroiks for their support and warm hospitality in their remote villages. The Puroik data 
for this paper was contributed by my friends Takia Soja from Sanchu, Shamil Painchey from Lasumpatte, Kagoi 
and Sanchang Rojo from Rojo, Phembu and Tshang Raiju from Bulu, John Rawa from Rawa, and Agung Saria from 
Saria. They bear no responsibility for any inaccuracies in my data. The original title of this paper at NEILS 
6 was "Phonological innovations in Puroik". I thank two anonymous reviewers and Linda Konnerth for the 
time and patience they took to go over previous versions. Many substantial improvements are owed to their detailed and 
constructive criticism. 
2009). All Puroik forms, as well as preliminary reconstructions and possible etyma mentioned 
in the course of the paper, are listed again in the appendix. 

Puroik is not one language but a chain of dialects in which geographically adjacent varieties 
are mutually intelligible but geographically distant varieties are not necessarily so. The “best 
known” of the three Puroik varieties is the dialect of Chayangtajo circle, East Kameng, where 
Sanchu is the biggest and most accessible Puroik village. Data on this variety has been 
published in various sources (Deuri 1983; Tayeng 1990; Remsangpuia 2008; Soja 2009). It 
is very similar to the varieties described in Chinese sources (Sūn et al. 1991; Li 
2004; see Table 2) and to the dialect of Lasumpatte near the Assam border (own field notes). 
The dialect of Kojo-Rojo is spoken in two, possibly three villages (Kojo, Rojo, Jarkam). It 
is different from, but mutually intelligible with, the dialect of other villages in Lada circle and, to 
some extent, the dialect of Bulu. Nowadays Bulu is separated from the rest of the Puroik 
language area by a three-day march on foot. However, it is said that the area where Bulu Puroik was 
spoken was once much more extensive and almost contiguous with the rest of the 
Puroik-speaking area. The variety of Bulu and the variety of the Chayangtajo area are not 
mutually intelligible, which I tested with several speakers from both sides by playing 
recordings of the other variety. Usually the other variety was not even recognised as 
Puroik. There are significant differences everywhere in the grammar and lexicon of these 
three languages. 


[Figure 1: map. Legend: Puroik, “Kho-Bwa”, Miji-Bangru, Hruso, Tani; hii-hai isogloss (dashed line); nasal-stop isogloss (dotted line). Labelled village: Sanchu (CT).] 

Figure 1 - Linguistic map of western Arunachal Pradesh 


The case markers may serve as an example of these differences: hardly any of them seems to be 
reconstructible for Proto-Puroik. 
Table 1 - Case markers in three Puroik dialects 


B KR CT 
OBJECT -ku -to -ro 
POSSESSIVE -uy -ta -ta 
INSTRUMENTAL -lapu -ta -ta 
ABLATIVE -lapu -ta -ta 
LOCATIVE Jo -la -la 
SOCIATIVE -laeN(ku) -kuy -kuy 


The lexical differences between Puroik dialects appear to be greater than the differences 
between the western Kho-Bwa languages Sherdukpen, Sartang and Khispi-Duhumbi, which 
are all considered to be separate languages (Lieberherr and Bodt 2015). Sometimes lexical 
differences can be explained as loans from a contact language such as Miji or Nyishi (e.g. 
FISH B gui < Miji, Saria poo < Nyishi, CT kahuar). But very often they cannot. Table 2 shows some 
basic roots in the three dialects which most probably do not go back to one etymon and 
cannot be explained as borrowings either from Miji or from a Tani language. The forms in the 
dialect(s) described in Chinese sources are usually relatable to the Chayangtajo dialect. 


Table 2 - Divergent basic roots 


Gloss B KR CT Sun et al. 1991 Li 2004 
MEAT fii mai marjek mari miu? li^ ^ 
MONKEY (MACAQUE) maraN ` sedug  mezii gee modas 
PIG wa? dui madou mə”du” mə”'du 75 
CHICKEN Yar takjuu | sekuu ha ku? ha?'ku 75 
LOUSE (HEAD) fi? haè PiE ve mie pui?'yai ® 
ONE tyi kuu bui hui” çun”? 
DO/MAKE ta? 30u kaik kat” Kat 

SIT rii djao tuy ton” tog 75 
LICK lja? jaa vjaa la Živia” lau 75 
STAR haNwai? hadan  hagaik ha?yat ® ha?'yai 9 


Note that the meanings of the ten roots in Table 2 are basic, and roots with such meanings 
usually tend to be cognate in closely related languages. For example, in the Tani languages most 
roots with these meanings are fairly similar and are reconstructible for Proto-Tani. 

The Puroiks are, of course, well aware of these differences. One commonly known 
dialect shibboleth is the existential copula wee : wai : wee, which means ‘there is’ in the 
eastern dialects, but ‘there is not’ in the western dialects. Sometimes the dialects in the west 
are called hai sak ‘hai-language’, as opposed to hii sak ‘hii-language’ in the east (hee : hai : 
hii are the words for WHAT). The line that divides the hii- and hai-languages, as well as the 
different uses of the existential copula, runs along the eastern Kameng river, which is a 
mighty river already high up in the mountains (dashed line in Figure 1) and forms a natural 
barrier between the western and eastern dialect areas. Note that this line does not coincide 
with the nasal-stop isogloss (dotted line in the map) discussed below in §4. 

Generally full syllables in Puroik, i.e. not affixes, have the structure (Ci)(G)VV or 
(Ci)(G)V(V)Cf, where Ci ∈ /k, t, p, g, d, b, tf, dz, m, n, j, r, l, w, s, f, z, 3, f, h, V/, G ∈ /4, 1, r, I/, 
Cf ∈ /ʔ, k, p, 9, n, m/ and V ∈ /a, £, e, o, ł 1, 0, 5, u/. The consonant inventory of 
the Bulu dialect is slightly bigger, contrasting /v/ vs. /w/, /tʃ/ vs. /ts/, /dʒ/ vs. /dz/ and /ʃ/ vs. /s/. 
VN in the Bulu dialect stands for an undefined nasal rhyme which might surface as [V, Vn, 
Vn, Vm], depending on the environment. Vowel digraphs (e.g. «ii, uu») always stand for 
tautosyllabic long vowels of heavy syllables. The sequence /CjV/ is interpreted as an onset 
cluster (not as a diphthong /CiV/), but the sequence /CuV/ as a simple onset with a following 
uV diphthong (not as an onset cluster /CwV/). 
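
The syllable template can be made concrete with a small Python sketch. It works on abstract skeletons rather than on actual phonemes, so it does not depend on the exact segment inventory: `C` stands for an initial consonant, `G` for a glide, `V` for one vowel position and `F` for a final consonant. These letters, the regular expression and the function name `is_full_syllable` are illustrative assumptions, not part of the original analysis.

```python
import re

# (Ci)(G)VV            -> optional C, optional G, long vowel or diphthong
# (Ci)(G)V(V)Cf        -> optional C, optional G, one or two vowels, final consonant
FULL_SYLLABLE = re.compile(r"C?G?(VV|VV?F)")


def is_full_syllable(skeleton: str) -> bool:
    """Return True if a C/G/V/F skeleton fits one of the two full-syllable shapes."""
    return FULL_SYLLABLE.fullmatch(skeleton) is not None
```

On this reading, an open syllable with a single short vowel (skeleton `CV`) is not a full syllable, which fits the treatment of short prefixes as a separate class.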


2. Syllable initials 

2.1. Plosives and affricates 

Onset plosives /k, t, p, g, d, b/ correspond in all dialects, word-initially as well as after a prefix 
or in a compound, and are reconstructed as such for Proto-Puroik (Tables 3-8).² The only 
plosive that is poorly attested is the voiced velar plosive g (Table 6). Plosives in 
combination with a glide will be discussed in §2.5. 


Table 3 - k : k : k < *k 


Gloss B KR CT PP 

WATER koo kua kua *kua 

EYE a-kom a-kom a-kok *a-kom 
HEAD a-kuN a-kug-baa  a-kok-baa *a-koy 
SKIN a-ku? a-ki? a-kaa *a-kur” 
SMOKE br-kii ^ bai-koo ` becht bar-a" 
EAR a-kuiN ` a-kun a-kuik *a-kun 

UP kuN kun kun *kuy 
BRIDGE ka-tyiN ka-tun ka-tuik = *ka-tun 
PILLOW ka-kom ` kog-kom ` ko-kom ` *kog-kom 


Table 4 - t : t : t < *t 


Gloss B KR CT PP 

GIVE taN tay tan “tay 
THAT tee tai tee “tai 
BITE t22 tua tua *tua 
LIGHT (NOT HEAVY)  a-too a-tua a-tua *a-tua 
TOOTH ka-toN tuay ka-tuag — *ka-tuay 
SHOULDER pa-tiy pua-tuy pua-tok pua-toy 
NECK ka-tuN-rin  tuy-rin ka-tuy *ko-tun 
BRIDGE ka-tyiN ka-tun ka-tuik *ka-tun 


2 Segments compared are bold. Reconstructions of which some parts had to be guessed, because they contain a 
non-established correspondence, are followed by ©. Attested forms which I consider cognate but which 
correspond irregularly in at least one segment are in parentheses (). Forms in square brackets [] are not 
considered to be cognate but are mentioned for the sake of completeness. A question mark ? means that the form is 
missing in the data. Semantics that are not straightforward are mentioned in footnotes. The Puroik data is all from my 
own fieldwork between 2012 and 2015. 
Table 5 - p : p : p < *p 


Gloss B KR CT PP 
CUT (HIT WITHDAO)  peN pan paik *pan 
SEW pin pin pin *pin 
SWEET a-pin a-pin a-piy *q-pin 
BLUE a-pii a-pii a-pii *q-pii 
SWELL pan pan paik *pan 
THICK (BOOK) a-pan a-pen a-pik *a-pen 
PUROIK (prin-daa) purun puruik ` *purun 
BIRD pa-duu pa-doo po-dou  *pə-dou® 
NOSE a-puy a-puy a-pok *a-pon 
SHOULDER pa-tig pua-tuy pua-tok *pua-toy 
LEFT SIDE pa-fii pua-fee pua-fee — *pua-fee 
Table 6 - g : g : g < *g 
Gloss B KR CT PP 
ISG guu goo goo *g00 
HAND/ARM a-ge? a-gei? a-geik *a-gat 
Table 7 - d : d : d < *d 
Gloss B KR CT PP 
GARLIC daN day dak *dan 
KNOW deN dan daik *dan 
CHILD a-daa a-doo a-dou *a-dou”) 
NINE duNgii donguee donguee  *don-gjee” 
BIRD pa-duu pə-doo pə-dou ` *pa-dou? 
Table 8 -b : b : b < *b 
Gloss B KR CT PP 
NEGATION ba ba ba *ba 
DREAM baN bay bak Zhang 
FIRE bee bai bee *bai 
SHY bii-weN bii-wan bii-waik *bii-wan 
SLEEPY ram-bin ram-bin ram-biy *rom-bin 
HEART a-luN-bao  a-lug-bao a-lok-bəə *a-log-boo 
SON-IN-LAW a-bo? bua? a-bua *bua? 
NAME a-bjeN a-baen a-biey *a-bjen 
FLOWER a-bueN hain-buan  ma-buaik *buan 
BEFORE bui bui bue *bui 
BELLY (EXTERIOR) a-lyi-buN = hui-bug  a-lue-buk ^ *a-lui-bur 
DOWN buu buu buu *buu 


The voiceless affricates before close and close-mid vowels correspond (Table 9), while 
before the nucleus /a/ some as yet unexplained variation occurs (§2.5). Well-corresponding 
examples for the voiced counterpart *dʒ are scarce. One is the deictic particle dʒi : dʒi : dʒe < 
*dʒe. 

Table 9 - tʃ : tʃ : tʃ < *tʃ 


Gloss B KR CT PP 
KNIFE (MACHETE) fii fee fee *ffee 
EAT fii fii tfii *ffii 
STAND fin fin ffir *ffin 
DIG ffu? ffu? ffoo *ffo? 


Table 10 shows a summary of the onset plosive correspondences. 


Table 10 - Summary of plosive onset correspondences 
B    KR   CT   PP    #    Table 
k    k    k    *k    9    3 
t    t    t    *t    8    4 
p    p    p    *p    11   5 
g    g    g    *g    2    6 
d    d    d    *d    5    7 
b    b    b    *b    12   8 
tʃ   tʃ   tʃ   *tʃ   4    9 
dʒ   dʒ   dʒ   *dʒ   1 
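
A summary of this kind can be derived mechanically from the cognate tables. The sketch below, in Python, tallies the root-onset correspondences for a handful of rows copied from Tables 3, 4, 5 and 8. The names `COGNATES`, `root_onset` and `tally` are hypothetical, the forms are given in plain ASCII as they appear in this text (vowel qualities may differ in the original transcription), and the onset is naively taken to be the first letter of the last hyphen-separated element, which only works for single-letter onsets.

```python
from collections import Counter

# Gloss -> (Bulu, Kojo-Rojo, Chayangtajo); a small sample, not the full data set.
COGNATES = {
    "WATER": ("koo", "kua", "kua"),
    "THAT": ("tee", "tai", "tee"),
    "SEW": ("pin", "pin", "pin"),
    "BLUE": ("a-pii", "a-pii", "a-pii"),
    "NEGATION": ("ba", "ba", "ba"),
    "FIRE": ("bee", "bai", "bee"),
    "DOWN": ("buu", "buu", "buu"),
}


def root_onset(form: str) -> str:
    """Return the first letter of the root, skipping any hyphenated prefix."""
    root = form.rsplit("-", 1)[-1]
    return root[0]


def tally(cognates: dict) -> Counter:
    """Count how often each B : KR : CT onset correspondence occurs."""
    return Counter(
        tuple(root_onset(form) for form in forms) for forms in cognates.values()
    )
```

Calling `tally(COGNATES)` groups the seven sample rows into four correspondence sets, one per plosive. A correspondence that is attested only once or twice in the full data would deserve the same caution the text applies to *g.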


2.2. Nasals 


None of the three Puroik dialects discussed in this paper allows a syllable-initial velar nasal. 
Phonetic [ɲ] occurs as an allophone of /n/ in front of the vowel /i/, e.g. B /niʔ/ [ɲiʔ] ‘two’. 
Elsewhere [ɲ] is analysed as a cluster /nj/ (e.g. anjee : anjei : anjee ‘breast (female)’). Syllable-initial 
nasal /m/ always corresponds (Table 11), syllable-initial /n/ almost always (Table 12). 


Table 11 - m : m : m < *m 


Gloss B KR CT PP 
VOMIT mue? muai mue *muai? 
MUSHROOM min may may *may 
HAIR (ON BODY) a-min a-man a-muin *a-mun 
RIPE a-min a-min a-mir *a-min 
FULL/SATIATED (min) moy moy *moy 


3 # refers to the number of examples. 
SKY ha-miy may ka-may *ha/ka-may 
CAN mueN muan muaiy *muan 
WAR mo? mua? mua *muar 
HOLD IN MOUTH mom ? mom *mom 
GHOST mo-lao mo-lau | mo-iaa *mo-laa 
FOOD ma-lueN ma-luan ma-luaik = *mo-luan 
Table 12 - n : n : n < *n 
Gloss B KR CT PP 
2sG naa nay naa *nay” 
SMELL nam nam nay *nam 
BROTHER (YOUNGER) a-n22 a-nua a-nua *a-nua 
TWO ni? nii nii *njp 
NEAR a-nyi a-nui a-nui *a-nui 


In a few cases, /n/ in the western dialects corresponds to /r/ in the east (Table 13). A 
rhotacism n > r is typologically not uncommon⁴ and is the reason for reconstructing *n. 
However, the condition for the change in Chayangtajo Puroik is not known yet, and I write 
the symbol *n for the time being. External evidence from Kuki-Chin also points to *n (see 
Table 73). Whether the 1PL belongs here, where the Bulu dialect irregularly also has r like the 
CT dialect, is questionable. External evidence could be taken to suggest that *n is original 
(Mizo kei-ni ‘we (exclusive)’ [Marrison 1967]). However, the KR dialect is in direct contact 
with Lada Miji and the 1PL ga-nii might be influenced by Miji 1PL a-ni (Abraham 2005). 


Table 13 - n : n : r < *n 


Gloss B KR CT PP 

DAY a-nii a-nii a-rii KS 
SUN? ha-mii ha-mii — k-rii *PFX-hii 
FLOW nye nuai rue REI 
LISTEN HIH nuy roy *noy 

BE ILL/SICK naN nay ray *han 

1PL (g-rii) ga-nii g-rei *ga-fiei 


Remarkably this n to r correspondence is also sporadically found in the contact languages 
of Puroik, Miji and Bangru, in a similar geographic distribution. 


4 E.g. in Tosk Albanian (Southern Albanian), Romanian and Korean. 

5 hami<*ham-nii (ham is the Western Puroik “sky”-prefix, which occurs in the roots for SKY, MOON, STAR, RAIN, 
SNOW), krii<*ka-nii (ka is the Eastern Puroik “sky”-prefix in SKY, CLOUD, SNOW, LIGHTNING). Western Puroik 
*mn > m is ad hoc. In the further discussion the root will not be listed anymore since it is homonymous with *nii 
‘day’. 
Table 14 - Summary of nasal onset correspondences 
B KR CT PP # Table 


m m m *m 11 11 
n n n *n 5 12 
n n r *ý 5 13 
r n r N 1 13 


2.3. Non-nasal sonorant onsets 


The trill /r/ and the lateral approximant /l/ are distinct phonemes in all three dialects, and 
generally correspond (Tables 15 and 16). An example of a reconstructible minimal pair is 
WEAVE (ON LOOM) *at-rua? vs. PENIS *a-lua?. 


Table 15 - r : r : r < *r 


Gloss B KR CT PP 
SHELF (OVER FIREPLACE) rap rap rak *rap 
FROG rə? rə? rao *ra? 

SIX rə? rə? rək *rok 
SLEEP rom ram ram *rom 
CANE rii rei rei Drei 
BURN (TRANSITIVE) rii rii rii *rii 
GUTS a-tyi-rin a-hui-rin | a-lue-riy *q-lui-rin 
RUN rin ren rik *rin 
PULL ryi rui rue *rui 
PRETEMPORAL CONV -ryila -ruila -ruila *-ruila 
PUROIK (prin-daa) purun puruik *purun 
WEAVE (ON LOOM) £P-Y2P ai-rua? ai?-rua *at-rua? 


Table 16 - l : l : l < *l 


Gloss B KR CT PP 

LEG a-lee a-lai a-lee *lai 

BOW (lii) lei lei *Iei 
PENIS a-lo? a-lua? a-lua KUER 
FOOD ma-lueN ma-luan | mo-luaik *ma-luan 
HEART a-luN-baa a-lug-bao | a-lok-bao *a-log-bao 
WARM a-lom a-lam a-lap *a-lam 


See Tables 28 and 29 for the special correspondences of /r/ and /l/ in combination with the 
palatal glide /j/. 

The labio-dental fricative /v/ and the labio-velar approximant /w/ are distinctive only in the 
Bulu dialect, and in the three corresponding cases Bulu /v/ corresponds to /w/ in the 
two eastern dialects. The roots for FIVE and GO form a minimal pair in the Bulu dialect (wuu vs. 
vuu) but are homonyms in the Kojo-Rojo and CT dialects (uu). I assume that FIVE and GO form 
a reconstructible minimal pair *wuu vs. *vuu and that the eastern dialects merged the two 
phonemes into /w/. 
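
The effect of such a merger on a minimal pair can be shown with a very small Python sketch. It models only the merger step itself (*v > w); the attested eastern forms are written uu, so any further development of the onset is deliberately left out. The names `PROTO` and `eastern_merger` are hypothetical.

```python
# Reconstructed minimal pair as given in the text (asterisks omitted).
PROTO = {"FIVE": "wuu", "GO": "vuu"}


def eastern_merger(form: str) -> str:
    """Merge the two labial phonemes into w, as assumed for the eastern dialects."""
    return form.replace("v", "w")


# After the merger both roots share one shape, i.e. they become homonyms.
MERGED = {gloss: eastern_merger(form) for gloss, form in PROTO.items()}
```

Because a merger cannot be undone from the merged forms alone, the contrast can only be reconstructed from a dialect that keeps it, which is the role Bulu plays here.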


Table 17 - w : w : w < *w 


Gloss B KR CT PP 

SAGO CLUB (TOOL) waN wan wak *wan 

FART wai? wai WEE *wai? 

LEECH pa-we? po-wai? | ka-waik *ka/pa-wat 

SHY bii-weN bii-wan bii-waik *bii-wan 

DRY a-wueN a-wuan a-wuaik *a-wuan 

HUSBAND a-wui a-wui a-wue *a-wui 

DOOR haN-wuiN | ha-wun ` ffuk-wuik | *HOUSE-wun 

FIVE wuu uu uu *wuu 
Table 18 - v : w : w < *v 

Gloss B KR CT PP 

3sG vee wai WEE WH 

GO vuu uu uu *vuu 


There are three instances where a glide in the Chayangtajo dialect corresponds to a 
fricative in the Bulu and Kojo-Rojo dialects. I assume that the glide is original. The sound 
change *j > ʒ in the two western dialects is typologically comparable to the pronunciation of 
/j/ as a palatal fricative [ʝ] in the dialects of South American Spanish (e.g. <ayer> ‘yesterday’ 
as [aˈʝer]). 


Table 19 - ʒ : ʒ : j < *j 


Gloss B KR CT PP 
AWAKEN (INTR) ʒao ʒau jaa *jaa 
BREATHE ʒuu ʒuu joo *joo 
WING a-ʒuiN a-ʒun a-juik *a-jun 


Table 20 - Summary of approximants and r 


B    KR   CT   PP    #    Table 
r    r    r    *r    12   15 
l    l    l    *l    6    16 
w    w    w    *w    8    17 
v    w    w    *v    2    18 
ʒ    ʒ    j    *j    3    19 


2.4. Fricatives 


For most fricative correspondences there are very few good examples. Four secure examples 
are available for the voiceless labio-dental fricative f (Table 21). For the voiced counterpart 
see Table 18. 
Table 21 - f : f : f < *f 


Gloss B KR CT PP 

NEW a-feN a-fan a-faik *a-fan 
BLOW fuu fuu fuk *fuu® 
LEFT SIDE pa-fii pua-fee pua-fee *pua-fee? 
MAN a-fuu a-foo a-fuu *a-fuu? 


Sibilant correspondences are very unclear (Table 22), partly because of unclear interactions 
with a following u and j (see the end of §2.5). 


Table 22 - Sibilants 


Gloss B KR CT PP 

ALIVE a-seN a-san a-sik *a-sen 

1DU ga-se-nir/ ga-se-nii | ga-se- nii *ga-se-ni?? 
(ga-he-ni?) 

FINGERNAIL (age? ga-sin) gei-sin ^ gei-sik *ge-sin 

TEN sueN Juan suaik *suan” 

BONE a-zeN a-zan a-zaik *a-zan 

QUIVER zap zap zak Scan 


So far only two examples of a corresponding glottal fricative are available (Table 23). The 
root for BLOOD has a labio-dental fricative in the Kojo-Rojo dialect. Maybe the lip rounding of 
the following vowel u caused the glottal fricative to become labio-dental. But this is an ad hoc 
rule as long as there are no other examples. 


Table 23 - KR *h > f / _u ? 


Gloss B KR CT PP 
THIS hin hay hay *hay 
BLOOD a-hui a-fui a-hue *a-hui” 


The lateral fricative⁶ is reconstructed from a correspondence between the Bulu dialect and 
the Chayangtajo dialect (Table 24). In KR younger speakers pronounce these roots with a 
glottal fricative. Older speakers sometimes pronounce the lateral fricative. It is as yet unclear 
whether this is an archaism, influence from another dialect or another register of the language. 


Table 24 - ɬ : h (ɬ) : ɬ < *ɬ 


Gloss B KR CT PP 
BELLY (INTERIOR)  a-lyi a-hui a-lue *a-lui 
FALL lu? hu? (tu?) (Yok-lo) ` ok" 
GHOST ma-lao ` mo-hau (ma-tau) ma-taa *mo-laa 
SHADE a-tim a-him a-lap *q-tim” 
STONE ka-liy ka-huy (ka-tun) - *ka-tuy” 


6 In the dialects of Puroik /ɬ/ stands for a lateral fricative [ɬ] and not for a voiceless lateral approximant [l̥]. 
Table 25 - Summary of fricative correspondences 
B    KR   CT   PP    Condition   #    Table 
f    f    f    *f                4    21 
s    s    s    *s                3    22 
s    ʃ    s    *s    _u          1    22 
z    z    z    *z                2    22 
h    h    h    *h                1    23 
h    f    h    *h    _u          1    23 
ɬ    h    ɬ    *ɬ                5    24 


2.5. Clusters 


Onset plosive-glide clusters in Bulu correspond to plosive-rhotic clusters in the two eastern 
dialects, although the pronunciation of [Pɹ]⁷ in the CT dialect alternates with [Pj]. Within 
Puroik itself there is no evidence yet for the reconstruction of two separate clusters *Pj and *Pɹ. 
I reconstruct *Pj, assuming a change *Pj > Pɹ in KR and CT, because the PP clusters 
*nj, *rj and *lj have to be assumed anyway (Tables 27, 28 and 29). However, Lepcha and Nungic 
Trung might be taken to suggest that the r-sound in the eastern dialects could actually be old 
(NAME PP *a-bjen, Lepcha ʔábryáng (Plaisier 2007) and Nungic Trung aif?! buur?? (Sūn et al. 
1991)). 


Table 26 - Pj : Pɹ : Pɹ < *Pj 


Gloss B KR CT PP 
BRANCH a-kjee hain-kaei haen-kace — *kjai 

SAGO PICK (FRONT PART) kju? kauk kaok *kjok 

ANT (diamdgu?)) | gamguau? ` ga4engao *ejamgjo?? 
NINE duNgii donga4ee dog ez *don-gjee? 
LONG a-pjaN a-p4ar a-p4ar *a-pjay 
LIVER a-pjiN a-pjin a-pjik *a-pjin 
SCRATCH bju? bau? b400 *bjo? 
NAME a-bjeN a-buen a-baey *a-bjen 
BAMBOO (EDIBLE) ma-bjao ma-baau ma-buaaa | *ma-bjaa 
CRAZY a-bjao (a-b4aa) baaa-bo *a-bjaa? 


The example for LIVER suggests that PP *Pj does not become rhotic in KR and CT if 
followed by i. The root for ANT in the Bulu dialect shows a palatalisation, unlike the root 
for NINE or the voiceless velar-glide clusters. This might be due to influence from 
Sartang.⁸ 
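
The conditioned change proposed here, *Pj becoming a plosive-rhotic cluster in the eastern dialects except before i, can be written as a single rule with an environment. The Python sketch below is only an illustration: the function name `eastern_cluster` is hypothetical, the rhotic is written ɹ as an assumption about the transcription, and the rule changes nothing but the onset cluster, so its output is not a full KR or CT form.

```python
import re

PLOSIVES = "ktpgdb"  # the six onset plosives reconstructed for Proto-Puroik

# Plosive + j, but only when the next segment is not i (negative lookahead).
PJ_TO_RHOTIC = re.compile(rf"([{PLOSIVES}])j(?!i)")


def eastern_cluster(proto: str) -> str:
    """Apply *Pj > Pɹ, blocked before i, to a form given without the asterisk."""
    return PJ_TO_RHOTIC.sub(r"\1ɹ", proto)
```

With NAME *a-bjen the rule yields a rhotic cluster, while LIVER *a-pjin is left untouched because the cluster precedes i. Clusters whose first member is not a plosive are outside the rule's scope and are likewise left unchanged.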

No examples for *tj and *dj could be found. The reason for this could be that part of the 
correspondence tʃ : tʃ : tʃ (Table 9) actually goes back to *tj and not *tʃ. Only one example is 

7 P stands for “plosive”. 

8 Sartang sardo? (Abraham et al. 2005). Sartang is assumed to be a close relative of Puroik (see Introduction, 
§1). The nearest Sartang village can be reached in one day from Bulu (the nearest Puroik village in 2-3 days). The mothers 
of five of the six surviving speakers of Bulu Puroik were from the Sartang community. 
available for corresponding nj and three examples for corresponding rj (Tables 27 and 28). 
Unlike in clusters with plosives and fricatives (Table 30), the glide does not become “rhotic” 
in the two eastern dialects in these cases. 


Table 27 - nj : nj : nj < *nj 


Gloss B KR CT PP 
BREAST (FEMALE) a-njee a-njei a-njee *a-njai 


Table 28 - rj : rj: rj < *rj 


Gloss B KR CT PP 
GREEN a-rjee a-rjei a-rjee *a-rjai 
WHITE a-rjuN a-rjun a-rjun *a-rjun 
TASTY/SAVORY (a-jim) a-rjem a-rjep *a-rjem 


The cluster *lj is simplified to a glide j in the Kojo-Rojo dialect. The root for LEAF, -lap : 
-jap : -lək, is divergent: the glide onset in the Kojo-Rojo dialect corresponds to l and not lj 
in the other two dialects (Table 29). Also irregular is the root for TONGUE, which starts with a 
trill in the CT dialect. 


Table 29 -lj : j : lj < *lj 


Gloss B KR CT PP 

FULL ljee jei ljee *[jai 
SEVEN mo-ljee jei ljee *mo-ljai 
EIGHT ma-ljao jau (laa) *mo-ljaa 
LEAF a-lap (hain-jap) a-lak *[jop? 
TONGUE a-lyi jui (a-rue) *a-[ui? 


There are four instances where ʃ corresponds to hɹ in the Kojo-Rojo dialect. For unclear 
reasons the CT dialect has hj in two cases. These roots were reconstructed as *sj because Bulu 
*sj > ʃ is a more plausible change than *hj > ʃ would be. Furthermore, the root for BLACK 
suggests that *hj is preserved in all dialects: a-hjeN : a-hjei : a-hj8 < *a-hjai. 


Table 30 - f : ha: ha < *sj 


Gloss B KR CT PP 
URTICA FIBRES JaN haay haak *sjan 
FIREWOOD SiN hain haer zeiten"! 
WET a-fam a-haam a-hjap *a-sjam 
ROT Jam haam hjap *sjam 


Affricates and palatal fricatives are sometimes followed by /j/ or /u/ in a rather 
inconsistent way. For example BITTER a-ffa? : a-ffua? : a-ffjaa which has /fu/ in Kojo-Rojo 
and //j/ in Chayangtajo. In the similar root ABOVE affay : affjan : affuay the onset in the KR 
dialect is /f;/ and /ffu/ in the CT dialect. Further combinations are: TARO ffja?: far: tfua, 
FAT/GREASE a-3ua : a-zjaa : a-zua, CRY ffe? : fap : (tfjap), maybe FAR affoi : a-ffai : (atffee), 
HAIR (ON HEAD OF HUMANS) kazaN : (kazjan) : kozak, WIFE aguu : azjoo : azou, THORN mazuN 
: məzun ` (kazjoy). This requires further data and study. 
Bulu Puroik is the only dialect to contrast /tʃ/ and /ts/. Bulu Puroik /ts/ seems to correspond to /tʃj/ in Kojo-Rojo. Since clusters and affricates have to be reconstructed anyway, reconstructing these two cases as *tʃj is more economical than assuming an additional phoneme *ts for the proto-language.


Table 31 - ts : tʃj : tʃ < *tʃj


Gloss B KR CT PP 
OLD (OF THINGS) a-tseN a-fjen a-faik a-tfjan 
THIN (BOOK) a-tsap a-tfjap | a-ffap *a-fffjap 


As far as other onset clusters are concerned, there are hardly any good examples. One is 
MUTE blo? : blo? : blok < *blok


Table 32 - Summary of cluster correspondences 
B KR CT PP # Table 


Pj Pa Pro ET 10 26 
nj nj nj — *nj 1 27 
rj rj rj Kai 2 28 
J rj rj Kai 1 28 
j j j U 2 29 
U j l Sr 1 29 
jf l *[j 1 29 
jf r *[j 1 29 
S ha hi Zei 2 30 
J tu hj Zei 2 30 
s M y KI 2 31 


3. Open Rhymes 
3.1. Short rhymes 


Short prefixes or light first syllables contain the vowel a or ə (Tables 33 and 34).


Table 33 - Ca- : Ca- : Ca- < *Ca- 


GLOSS B KR CT PP 
KINSHIP PREFIX a- a- a- *q- 
BODYPART PREFIX a- a- a- *q- 
ADJECTIVE PREFIX a- a- a- *q- 
PREDICATE NEGATION ba- ba- ba- *ba- 
BRIDGE ka-tyiN  ka-tun ` ka-tuik *ka-tun 
Table 34 - Cə- : Cə- : Cə- < *Cə-


GLOSS B KR CT PP 
GHOST ma-lao ma-lau ma-laa *ma-laa 
BIRD pa-duu pa-doo pə-dou *na-dou”) 
TOOTH ka-toN tuan ka-tuay *ka-tuay 
HAIR (ON HEAD) ka-zaN ka-zjay ka-zak *ka-zan 
NECK ka-tuN-rin tug-rin ka-tuy *ka-tuy 
1PL g-rii ga-nii g-rei *ga-nei 


The short vowel a also occurs in suffixes (Table 35). 


Table 35 - -Ca : -Ca : -Ca < *-Ca 


Gloss B KR CT PP 
PRETEMPORAL CONV -Cryila ` -ruila -ruila *-ruila 
IMPERFECTIVE -na -na -na *na 


3.2. Long rhymes 


Vowel correspondences are generally still less clear. On the basis of the current data, the only
reconstructible monophthong is ii, which corresponds in all dialects (Table 36).


Table 36 - ii : ii : ii < *ii 


Gloss B KR CT PP 
BLUE a-pii a-pii a-pii *q-pii 
DAY a-nii a-nii a-rii *q-nii 
BURN (TRANSITIVE) rii rii rii *rii 

DIE ii ii ii NRI 

SHY bii-weN bii-wan bii-waik *bii-wan 
EAT fii fii fii *fii 


The two i-diphthongs /ai/ and /ui/ are attested with a number of examples (Tables 37, 38 and 39). As for the diphthong /ai/, it is assumed that the Kojo-Rojo dialect preserved the original diphthong and that there was a monophthongisation in the dialects of Bulu and Chayangtajo (*ai > ee).


Table 37 - ee : ai : ee < *ai


Gloss B KR CT PP 
FIRE bee bai bee *bai 
THAT tee tai tee *tai 
LEG a-lee a-lai a-lee *lai 
3sG VEE wai WEE *vai 
FRUIT JiN-wee hain-wai YON-WEE *wai 
FLOW nye nuai rue *nuai 
The diphthong *ai was raised to ei in the Kojo-Rojo dialect after a palatal glide (*ai > ei / j _). The diphthongs ai and ei are different phonemes in the Kojo-Rojo dialect (lai ‘put’ vs. lei ‘bow’).


Table 38 - ee : ei : ee < *ai / j _

Gloss B KR CT PP 
SEVEN mə-ljee jei ljee *ma-ljai 
FULL ljee jei ljee *ljai 
GREEN a-rje£ a-rjei q-rje£ *a-rjai 
BRANCH a-kjee hain-kaei haeg-kaee *kjai 
BREAST (FEMALE) a-njee a-njei a-njee *a-njai 
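

The two changes affecting *ai can be sketched as a small function. This is only an illustration under simplifying assumptions: the function name and the plain-ASCII spellings are hypothetical, only the rhyme is modelled (onset changes such as *lj > j in Kojo-Rojo are ignored), and the forms are written as they appear in Tables 37 and 38.

```python
def reflex_ai(proto: str, dialect: str) -> str:
    """Return the expected dialect reflex of a proto-form ending in *ai.

    Only the rhyme is modelled; onsets are copied unchanged.
    """
    form = proto.lstrip("*")
    if not form.endswith("ai"):
        return form  # rule does not apply
    stem = form[:-2]
    if dialect in ("B", "CT"):
        # Monophthongisation in Bulu and Chayangtajo: *ai > ee
        return stem + "ee"
    if dialect == "KR":
        # Kojo-Rojo keeps ai, but raises it to ei after a palatal glide
        return stem + ("ei" if stem.endswith("j") else "ai")
    raise ValueError(f"unknown dialect: {dialect}")


for proto in ("*tai", "*a-njai"):
    print(proto, [reflex_ai(proto, d) for d in ("B", "KR", "CT")])
```

For THAT (*tai) and BREAST (*a-njai) this reproduces the attested triples in Tables 37 and 38.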


The diphthong /ui/ corresponds in all dialects. After alveolar sounds [ź d, ts, œ, r, I, #] the diphthongs /ui/ and /ue/ are fronted to [yi] and [ye] in the Bulu dialect (but ui and yi are not different phonemes).


Table 39 - ui : ui : ue < *ui

Gloss B KR CT PP 
BEFORE bui bui bue *bui 
HUSBAND a-wui a-wui a-wue *a-wui 
BLOOD a-hui a-fui a-hue *a-hui 
NEAR a-nyi a-nui (a-nui) *a-nui”? 
TONGUE a-lyi jui a-rue *a-Iui? 
BELLY (INTERIOR)  a-]yi a-hui a-lue *a-lui 
PULL ryi rui rue *rui 
PRETEMPORAL CONV -ryila -ruila -ruila *-ruila


The evidence for the existence of a possible Proto-Puroik diphthong *ei is scarce (Table 40). The KR dialect is in direct contact with Lada Miji, and the 1PL ga-nii might be influenced by Miji 1PL a-ni.


Table 40 - ii : ei : ei < *ei


Gloss B KR CT PP 

CANE rii rei rei *rei 
FOUR vii waei wei *yjei 

1PL g-rii (ga-nii) g-rei *o-rei 


There is a small number of examples where long aa in the CT dialect corresponds to a diphthong in Bulu (ao) and in Kojo-Rojo (au). Because there is no iu, ou, au or eu, this would be the only evidence for the existence of u-diphthongs in Proto-Puroik. Thus it is assumed, until there is more evidence for u-diphthongs, that the long aa in the CT dialect is original and was diphthongised in the western two dialects.


Table 41 - ao : au : aa < *aa


Gloss B KR CT PP 
AWAKEN (INTR) ` 340 zau jaa *jaa 
BAMBOO (EDIBLE) ma-bjao ma-baau ma-biaa ` *mabjaa 
EIGHT ma-ljao jau (laa) *ma-ljaa 
CRAZY a-bjao (a-baaa) biuaa-bo — *a-bjaa 
GHOST ma-tao ma-tau ma-taa *ma-taa 


3.3. Bulu ɔɔ vs. East ua


Bulu Puroik has ɔɔ where other dialects have a diphthong ua in open as well as in closed
syllables (Tables 42 and 43).


Table 42 - ɔɔ : ua : ua < *ua


Gloss B KR CT PP 
WATER kaa kua kua *kua 
BITE t22 tua tua *tua 
MALE a-paa a-pua a-pua *a-pua 
FEMALE a-mja a-mua a-mua *a-mua 
BROTHER à-n33 a-nua a-nua *a-nua 
(YOUNGER) 

LIGHT a-ta9 a-tua a-tua *a-tua 
ITCH 22 a-wua a-wua *a-wua? 
FAT/GREASE 4-322 a-zjaa a-zua *a-zua ? 


Table 43 - ɔC : uaC : uaC < *uaC


Gloss B KR CT PP 
PENIS a-lo? a-lua? a-lua *a-lua? 
WAR mar mua? mua *mua? 
WEAVE (ON LOOM) dad ai-rua? ai?-rua *at-ruar 
TOOTH ka-taN tuay ka-tuay *ka-tuay 


Until recently, I was unaware of this fact because I was working with the only speaker in Bulu who, alone but very consistently, pronounces every /ɔɔ/ as [ua]. This speaker is the oldest of the six surviving cousin brothers who still speak this language. Does this speaker possibly preserve an archaic pronunciation? The people of Bulu say, and even he himself agrees, that this is not the case. The oldest speaker’s mother, as well as his late wife, were from Kojo-Rojo. The ua pronunciation is considered to be an eastern Puroik fashion, and [ɔɔ] to be the original uncontaminated Bulu Puroik pronunciation. On the other hand, the mothers of the other five Bulu Puroik speakers were from the Sartang community, and in Sartang the word for water is Do (Abraham et al. 2005) as compared to Puroik kua, so it is not impossible that the ɔɔ-pronunciation is due to Sartang or Bugun⁹ influence at some time in the history of the


9 Bugun WATER £o:

language. This is, however, less likely since not all items in Table 42 have cognates in Sartang. If *ɔ is inherited, does this mean that under the new circumstances all reconstructions *ua for Proto-Puroik have to be revised to *ɔ? This is possible. The reason why it was not done here is that the reconstruction of the diphthong *ua would still be necessary for alveolar rhymes with nucleus ua. These roots correspond exactly, as if u was a consonant, like regular an-rhymes eN : an : ai? (Table 44).


Table 44 - ua nucleus with alveolar coda 


Gloss B KR CT PP 
FLOWER a-bueN hain-buan ma-buaik *buan 
FOOD ma-lueN ma-luan ma-luaik *ma-luan 
DRY a-wueN a-wuan a-wuaik *a-wuan 
CAN mueN muan muaiy *muan 


At some point the pre-forms of these roots in the Bulu and Chayangtajo dialects must have had a nucleus ua. From a proto-form *bɔn the Bulu form a-bueN (not †a-bɔiN) and CT mə-buai? (not †mə-bɔi?) cannot be understood. Unless *ɔ > ua was a parallel innovation in all three dialects, there is no way around assuming a nucleus *ua for Proto-Puroik.


3.4. Summary of open rhyme correspondences 


Table 45 - Summary of open rhyme correspondences 
B KR CT PP Condition # Table 


Ca Ca Ca  *Ca 7 33,35 
Cə Cə Cə  *Co 6 34 
ao au aa  *aa 5 Al 
ü i ii NI 6 36 
& ai CC *ai 6 37 
ee ei Ep *ai j 5 38 
ii ei ei *ei 3 40 
ui ui ue “ui 3 39 
yi ui ue *ui  [talveolar] ` 5 39 
29 ua ua — *ua 8 42 


4. Closed Rhymes 

There are four types of closed rhyme correspondences: (1) nasal-nasal: with nasals in all dialects (§4.1); (2) stop-stop: with final stops in all dialects (§4.2); (3) glottal: glottal stop in Bulu and open syllable in the east (§4.3); (4) nasal-stop: with nasals in the west (B, KR) and stops in the east (CT) (§4.4). There are no examples of stops in the west and nasals in the east.
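

The four-way classification can be sketched as a small decision function. This is a hedged illustration, not the procedure actually used for the tables: the function name, the coda spellings ("N" for the Bulu placeless nasal, "?" for the glottal stop, "" for an open syllable) and the "unclassified" fallback are assumptions made for the example.

```python
NASALS = {"N", "n", "m", "ŋ"}
STOPS = {"p", "t", "k", "?"}


def coda_class(b: str, kr: str, ct: str) -> str:
    """Classify a cognate set by its codas in Bulu, Kojo-Rojo and Chayangtajo."""
    west_nasal = b in NASALS and kr in NASALS
    west_stop = b in STOPS and kr in STOPS
    if west_nasal and ct in NASALS:
        return "nasal-nasal"   # (1) nasals in all dialects
    if west_stop and ct in STOPS:
        return "stop-stop"     # (2) final stops in all dialects
    if b == "?" and ct == "":
        return "glottal"       # (3) glottal stop in Bulu, open syllable in the east
    if west_nasal and ct in STOPS:
        return "nasal-stop"    # (4) nasals in the west, stops in the east
    return "unclassified"      # e.g. the unattested stop-in-west, nasal-in-east type


print(coda_class("m", "m", "m"))   # SLEEP ram : ram : ram
print(coda_class("?", "?", "k"))   # SIX
print(coda_class("?", "?", ""))    # PENIS
print(coda_class("N", "ŋ", "k"))   # DREAM
```

The glottal class is keyed on Bulu and Chayangtajo only, since Kojo-Rojo sometimes drops the glottal stop.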


4.1. Nasal-nasal 


The CT dialect has two contrastive nasal codas (-ŋ, -m); the Bulu and Kojo-Rojo dialects have three contrastive nasal codas at three places of articulation (-ŋ, -n, -m). However, in the Bulu dialect there are no contrastive examples for -ŋ vs. -n after the nuclei /a, e, 2, u/. The nasal coda in this dialect is in isolation often realised as a nasalisation of the nucleus or, if not word-final, its place of articulation is assimilated to the following phoneme (e.g. /a-luN-bao/ is pronounced as [alumbaa]). The grapheme ‹N› stands for a “placeless” nasal. For Proto-Puroik it is assumed that the three-way distinction in the Kojo-Rojo dialect is original (Tables 46, 47 and 48).
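

The behaviour of the placeless nasal N can be sketched as follows. This is a simplified illustration: the function name and the consonant-to-place table are assumptions, only the nasal itself is modelled (no vowel changes), and word-final N is always rendered as nasalisation although the text says this happens "often".

```python
# Place of the nasal that results before a given following consonant (assumed table).
PLACE = {
    "p": "m", "b": "m", "m": "m",
    "t": "n", "d": "n", "n": "n",
    "k": "ŋ", "g": "ŋ",
}


def realise_N(word: str) -> str:
    """Spell out the placeless nasal N in a Bulu form written with hyphens."""
    segs = word.replace("-", "")
    out = []
    for i, ch in enumerate(segs):
        if ch != "N":
            out.append(ch)
        elif i + 1 == len(segs):
            out.append("\u0303")  # word-final: combining tilde nasalises the nucleus
        else:
            out.append(PLACE.get(segs[i + 1], "N"))  # assimilate to next segment
    return "".join(out)


print(realise_N("a-luN-bao"))
print(realise_N("duNgii"))
print(realise_N("taN"))
```

With HEART a-luN-bao the nasal comes out as m before b, in line with the labial nasal in the pronunciation cited above.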


Table 46 - Final velar nasal-nasal 


Gloss B KR CT PP 
GIVE taN tag tay *tay 
ILL/SICK naN nay ray *nay 
ABOVE a-faN a-ffjag a-ffuag *a-ffuag A 
TOOTH ka-taN tuay ka-tuay *ka-tuay 
THIS hiy hay hay *hag 
MUSHROOM miy may may *may 
SKY ha-miy may ka-may *ha/ka-may 
LISTEN niy nuy roy *nog 
THORN ma-zuN ma-3uy (ka-zjoy) *ma/ka-zog 
STONE ka-lig ka-huy : *ka-duy 
NINE duNgii dugga4ee donguce *doy-gjee” 
UP kuN kuy kuy *kuy 
WHITE a-rjuN a-rjuy a-rjuy *a-rjug 
NECK ko-tuN-rin | tug-rin ka-tuy *ka-tuy 
Table 47 - Final alveolar nasal-nasal 
Gloss B KR CT PP 
CAN mueN muan muaiy *muan 
NAME a-bjeN a-baen a-baeg *q-bjen 
HAIR (ON BODY)  a-min a-man a-muiy *a-mun 
SWEET a-pin a-pin a-piy *a-pin 
DRINK in in [rig] *in 
STAND Yin yin fin *fin 
RIPE a-min a-min a-miy *q-min 
SEW pin pin piy *pin 
GUTS a-lyi-rin a-hui-rin a-lue-rig *q-lui-rin 
SLEEPY ram-bin ram-bin ram-biy *rom-bin 
FIREWOOD fiN hain haen zeien)" 
Table 48 - Final labial nasal-nasal


Gloss B KR CT PP 

SLEEP ram ram ram *ram 
PILLOW ka-kam kon-kam ko-kam *kon-kam 
HOLD IN MOUTH mom ? mom *mom 


4.2. Stop-stop 


All three dialects have two distinct stop codas (B -?, -p; KR -?, -p; CT -k, -p). Bulu and Kojo-Rojo do not distinguish final glottal stop and final velar stop. The reason for analysing them as glottal stops is that they are characterised by properties of the preceding vowel (short, high-pitched and creaky) rather than by the closure. Roots with final -? in the Bulu and Kojo-Rojo dialects and final -k in the CT dialect are reconstructed with final *-k.


Table 49 - -? : -? : -k < *-k


Gloss B KR CT PP 
SIX ra? ro? rok *rok 
FALL lu? lu? (liok-lo) ` *lok 
MUTE/STUPID blo? blo? blok *blok 
SAGO PICK (FRONT PART) kju? kau? kiok *kjok 


Stop-stop roots with a /Vik/ rhyme in the CT variety were reconstructed as *Vt in analogy to the nasal-stop correspondence (Table 50). Further evidence for this reconstruction comes from the dialect of Rawa (between Kojo-Rojo and Chayangtajo), which preserves Vt: Rawa at ‘cloth’, at ‘to kill’, pawat ‘leech’, gat ‘hand’, bit ‘extinguish’.


Table 50 - -? : -i? : -ik < *-t


Gloss B KR CT PP 
CLOTH cd ai? aik *at 
KILL [we?] ai? aik *at 
LEECH [pa-]we? [pao-]wai? ^ ka-waik *ka-wat 
HAND/ARM a-ge? a-gei? a-geik *q-gat 
EXTINGUISH (INTR) [ge?] bi? bik *bit 


Final bilabial stops in the Bulu and Kojo-Rojo dialects correspond to final stops in the Chayangtajo dialect (Table 51). However, this dialect sometimes has a final -k instead of the expected final -p (see also Table 56). The dialect of Sario and Saria, which is the first Puroik dialect west of the Kameng River, allows only one stop in coda position (-k). The Chayangtajo dialect looks as if the process of merging all final stops (*-k, *-t, *-p > *-(i)k) was only half completed. While the merger in the Sario-Saria dialect is exceptionless, in the CT dialect only final alveolars changed (*-t > -ik) and only some labials (*-p > -k). Language contact and borrowing might be an explanation for this distribution.


Table 51 - -p : -p : -p/k < *-p


Gloss B KR CT PP 
SHELF (FIREPLACE) rap rap rak *rap 
LEAF a-lap (hain-jap) a-lak *[japO 
CRY (ffe?) yap op "ffjap 
THIN (BOOK) a-tsap a-fjap a-ffap *a-ffap 


4.3. Glottal 


There is a second class of correspondences involving glottal codas. In a number of cases the western dialects of Bulu and Kojo-Rojo have a glottal stop coda but the dialect of Chayangtajo has an open syllable. The numeral TWO in the Kojo-Rojo dialect might be influenced by Bengni (Western Tani).


Table 52 - Final glottal stop 


Gloss B KR CT PP 
CAVE wu? ue 00 *wor”) 
SKIN a-ku? a-ki? a-kaa *a- up 
CUT (WITHOUT LEAVING BLADE) i? i? ii *jp

TWO ni? (nii) nii *nip 
TARO tja? tja? fua *fuap? 
BITTER (a-ffa?) a-ffua? a-ffjaa *a-ffuap 
WEAVE (ON LOOM) €?-r0? ai-rua? ai?-rua *at-ruaP 
PENIS a-lo? a-lua? a-lua *a-lua? 
WAR mo? mua? mua *mua? 
FROG rap rap raa *rap 


Kojo-Rojo drops the glottal coda after the diphthong ai (Table 53). 


Table 53 - e? : ai : e < *ai?


Gloss B KR CT PP 
FART wai? wai WEE *waip 
VOMIT mue? muai mue *muaiP 


These cases were reconstructed as PP *-?, which means that Proto-Puroik is assumed to have a four-way contrast of stop codas (*-?, *-k, *-t, *-p). This looks somewhat suspicious since the maximum of distinctive coda stops, in any extant Puroik dialect I know about, is three (Rawa -k, -t, -p). That a proto-language might have more segments than any attested daughter language is not a priori impossible.¹⁰ Further research might reveal more evidence,


10 For example, the “laryngeal theory” for Proto-Indo-European, reconstructing three placeholders h1, h2 and h3, has proven to be very powerful and is able to account elegantly for many questions in many languages. Although only one of the reconstructed laryngeals is directly attested in only one branch of Indo-European (h2 in Anatolian), most Indo-Europeanists assume one form or another of the “laryngeal theory”.

maybe even direct evidence from an undescribed dialect, for or against the reconstruction of a glottal coda.


4.4. Nasal-stop 


In a great number of examples a final nasal in the dialects of Bulu and Kojo-Rojo corresponds to a homorganic final stop in the dialect of Chayangtajo. It is not clear to what these codas have to be reconstructed (see discussion below, §4.5). For the time being I use the symbols -7, -n, -m for this type of correspondence (Tables 54, 55 and 56).


Table 54 - Final velar nasal-stop 


Gloss B KR CT PP 
DREAM baN bay bak *bay 
HAIR (ON HEAD) ka-zaN ka-zjay ka-zak *ko-zag 
URTICA FIBRES JaN haay haak *sjay 
GARLIC (HOOKERI) daN day dak *day 
SAGO CLUB (TOOL) waN way wak *wag 
HEART a-luN-bao | a-lug-bao a-lok-baa *a-log-bao 
NOSE a-puN a-puy a-pok *a-poy 
SHOULDER pa-tiy pua-tuy pua-tok pua-toy 
HEAD a-kuN a-kuy-baa_ | a-kok-bao *a-koy 
BELLY (EXTERIOR)  a-lyi-buN ` hui-buy a-tue-buk *a-tui-buy 


The alveolar coda *-ñ triggered an epenthetic i in the Bulu and Chayangtajo dialects (*Vn > *Vin). This epenthesis must have been independent because there is no evidence for an i-epenthesis in Kojo-Rojo. In the Bulu dialect the epenthesis occurred before the monophthongisation *ai > ee (Tables 37 and 38), because diphthongs arising from this epenthesis are monophthongised in the same way as inherited diphthongs (CUT *pan > *pain > peN like FIRE *bai > bee). In the CT dialect the epenthesis must have happened after the monophthongisation of *ai (CUT *pan > *pain > paik, different from FIRE *bai > bee).
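

The relative chronology described above can be made explicit by applying the same two rules in opposite orders. The sketch below is only an illustration: the function names are hypothetical, the regular expressions work on the plain-ASCII spellings used in the tables, the short e before a coda is a modelling assumption based on forms like peN, and the epenthesis rule is meant to be applied only to roots of the alveolar nasal-stop class.

```python
import re


def epenthesis(form: str) -> str:
    """*Vn > *Vin: insert i before a final alveolar nasal."""
    return re.sub(r"([aeou])n$", r"\1in", form)


def monophthongise(form: str) -> str:
    """*ai > ee in open syllables, e before a coda (assumed shortening)."""
    form = re.sub(r"ai$", "ee", form)
    return re.sub(r"ai(?=[^aeiou])", "e", form)


def bulu(proto: str) -> str:
    # Bulu: epenthesis first, then monophthongisation; final nasal becomes N.
    form = monophthongise(epenthesis(proto))
    return re.sub(r"n$", "N", form)


def chayangtajo(proto: str) -> str:
    # CT: monophthongisation first, then epenthesis; final -in is realised as -ik.
    form = epenthesis(monophthongise(proto))
    return re.sub(r"in$", "ik", form)


for proto in ("pan", "bai", "a-zan"):
    print(proto, bulu(proto), chayangtajo(proto))
```

Because Chayangtajo monophthongises before the epenthetic i appears, the new sequence ai in *pain is not affected, whereas in Bulu it merges with inherited *ai.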


Table 55 - iN: n: ik < *-ñ 


Gloss B KR CT¹¹ PP

BONE a-zeN a-zan a-zaik *a-zan 
NEW (OF THINGS) a-feN a-fan a-faik *a-fan 
CUT (HIT WITH DAO) peN pan paik *pan 
DRY a-wueN | a-wuan a-wuaik *a-wuan 
SHY bii-weN | bii-wan bii-waik *bii-wan 
FOOD ma-lueN ` ma-luan ma-luaik *ma-luan 
FLOWER a-bueN hain-buan ma-buaik *buan 
SWELL pan pan poik *pan 


11 To my knowledge, the Chayangtajo dialect has no contrast between final velar and final alveolar, either for nasals or for stops. Occasional pronunciation of /-ik/ as [it] might be secondary or the influence of a dialect further east (the dialect further east of the Chinese data has final /it/), rather than an archaism.

Gloss B KR CT PP
NIGHT/DARK a-ffeN a-ffen a-ffik *a-ffen 
THICK (BOOK) a-pan a-pan a-pik *a-pan 
ALIVE a-seN a-san a-sik *a-sen 
LIVER a-pjiN a-pjin a-pjik *a-pjin” 
RUN rin ren rik *rin 

EAR a-kuiN a-kun a-kuik *a-kun 
DOOR haN-wuiN ha-wun ffuk-wuik *HOUSE-wun 
BRIDGE ka-tyiN | ka-tun ka-tuik *ka-tun 


The examples for the bilabial nasal-stop correspondence have a final bilabial nasal in the dialects of Bulu and Kojo-Rojo and a stop in the CT dialect. The stop in the CT dialect is sometimes a velar and sometimes a labial (see also the explanation to Table 51).


Table 56 - -m : -m : -p/k < *-m


Gloss B KR CT PP 

EYE a-kam a-kam a-kak *a-kom 
MOUTH a-sam a-sam a-sak *a-sam 
MORTAR satsom ffunfom ffjugtfok *fun-fom 
PATH lim lim lik (Saria) — "lim 
WARM a-lom a-lam a-lap *a-lom 
SHADE a-lim a-him a-łəp *a-Hiin 
WET a-fam a-haam a-hjap *a-hjam 
ROT fam haam hjap *sjam 
TASTY/SAVORY (a-jim) a-rjem a-rjep *a-rjem T) 


4.5. Summary of closed rhyme correspondences 


Table 57 - Summary of coda correspondences 


B KR CT PP Condition # Table 

P -n -0 *y 14 46 
-iN -n -in TH 11 47 
-m -m -m *-m 3 48 
-? -P -k SE 4 49 
-? -i? -ik Wi 5 50 
-P -P -P TD 2 51 
-p -p -k TD 7 2 5] 
-? -? -0 *? 10 52 
-? -Ø -Ø *? ai_ 2 53 
B KR CT PP Condition # Table

P -y -k Wi 11 54 
-iN -n -ik TH 16 55 
-m -m -p WO 5 56 
-m -m -k Wi 7 4 56 


There is no reason not to reconstruct a nasal for the nasal-nasal codas (Tables 46-48) and a stop for the stop-stop codas (Tables 49-53). The only question is what to do with the 36 nasal-stop examples in Tables 54-56. Do they (a) also go back to a nasal? (b) also go back to a stop? (c) go back to something which is neither a nasal nor a stop? Or is the answer a combination of (a), (b) and (c), i.e. some go back to a nasal, some to a stop and some to a third source?


The three symbols used for the nasal-stop codas are simply placeholders (they could also be called x1, x2 and x3). From Puroik there is no evidence so far as to what could be original. If nasals were original, the question is why in the eastern dialect some nasal codas were denasalised and others in an almost identical environment were not (Table 58).


Table 58 - (Near) minimal pairs for nasal rhymes 


Gloss B KR CT PP 

RUN rin ren rik *rin 

GUTS a-lyi-rin a-hui-rin a-lue-riy *a-lui-rin 
GARLIC daN dag dak *dag 

GIVE taN tag tag *tay 

EYE a-kam a-kam a-kak *a-kam 
PILLOW ka-kam kog-kom ^  ko-kom *kon-kam 
WARM a-lam a-lam a-lap *a-lom 
SLEEP ram ram ram *ram 


On the other hand, if stops were original, one would have to explain why the western two dialects nasalised some codas and not others in a similar environment (Table 59).


Table 59 - (Near) minimal pairs for stopped rhymes 


Gloss B KR CT PP 
LEECH [pa-] we? [pa-]wai? | ka-waik *ka-wat 
SHY bii-weN bii-wan bii-waik *bii-wan 
QUIVER zap zap zak *zap 
MOUTH a-sam a-sam a-sak *a-som 
SHELF (OVER rap rap rak *rap 
FIREPLACE) 

ROT fam haam hjap *sjam 
This situation is unsatisfying, especially because the nasal-stop class is not just a handful of exceptions but happens to be the most numerous correspondence class (nasal-stop 36, nasal-nasal 28, stop-stop 13, glottal 9). A comparison with proposed cognates in Kuki-Chin (§5.1) shows that the nasal-stop class corresponds to nasals, with the alveolar nasal-stop class (*-ñ) also corresponding to PKC *-r. If this is taken as evidence that the nasals are original, then a rule for a denasalisation in the eastern Puroik varieties has to be found. The documentation and analysis of the archaic-looking Puroik varieties between Kojo-Rojo and Chayangtajo (Poube, Rawa) might shed light on what this rule might have been.


5. Puroik rhymes compared to Kuki-Chin 


There are two questions motivating a comparison of Puroik with other Tibeto-Burman languages. First, the question of the relation of Puroik to the Tibeto-Burman family. Is there an inherited core vocabulary that is cognate to vocabulary in other Tibeto-Burman branches? Second, if yes, can other languages help to understand problems in the reconstruction of Proto-Puroik? In order to keep the scale of this article manageable, the comparison was limited to one sub-group of Tibeto-Burman, Kuki-Chin, with the data set given in VanBik (2009). Kuki-Chin was chosen for the comparison because it is geographically distant enough that any kind of recent borrowing between Puroik and Kuki-Chin can be excluded. If Puroik and Kuki-Chin share cognate basic roots, these are likely to be inherited from a common ancestor. A second reason for choosing Kuki-Chin is that Kuki-Chin has a rich inventory of coda consonants which might help to understand the codas in Puroik. And third, the great amount of data assembled and analysed in VanBik's (2009) reconstruction of Proto-Kuki-Chin facilitates the comparison enormously.¹²


5.1. Closed rhymes compared to Kuki-Chin 
The coda *-? was reconstructed in §4.3 on the basis of roots which have a final glottal stop in Bulu and often in Kojo-Rojo, but no coda in the east. The corresponding coda in suggested cognates in Proto-Kuki-Chin is *-? or *-k (Table 60).


Table 60 - Glottal codas compared to Kuki-Chin 


Gloss B KR CT PP PKC Mizo 

TWO ni? (nii) nii *nip *ni? »zIhni? hnih 

DIG ffu? ffu? ffoo *fo? *tsaw, tso?-II chó-I, chawh-II 

FART wai? wai wee ` *wai? *woy? X wey? voy? (HL) 

CUT (WITHOUT i? i? ii NE *thip sam thi?” 

LEAVING BLADE) 

SKIN a-kuP  a-ki?  a-koo  *a-ku? *khok-I, kho?-1I khok-I, kho?-II (HL)'* 
SON-IN-LAW a-bo? bua?  a-bua *bua? *maak máak pà 

CAVE wu? u? 00 *wo? *thuuk tháuk? 


12 The following tables contain a column with a reconstruction for Proto-Kuki-Chin and a column containing a form from a modern language illustrating the reconstruction, in most cases from Mizo (Central Chin). All data from Kuki-Chin are cited as in VanBik (2009).

13 ‘comb’ (n.)

14 ‘peel, strip off’

15 ‘deep, profound’


15. A progress report on the historical phonology and affiliation of Puroik 


The stop codas *-k, *-t, *-p also correspond to stops in Kuki-Chin (Table 61). 


Table 61 - Stop codas compared to Kuki-Chin 


Gloss B KR CT PP PKC Mizo 

SIX rae rae rak *rok *ruk rük 

FALL lu? lu? (jok-lo) *lokO ` — *kluu-I, kluuk-II — tlà-I, tlüuk-II 
LEECH [pa-]we? | [pa-|wai? ka-waik *ka-wat *watxwotxwut | vàng vat 

KILL [we?] ai? aik *at that-I, bad thát-I, thah-II 
HAND/ARM a-ge? a-gei? a-geik *a-gat *kut X khut küt 
EXTINGUISH (INTR) [ge?] bi? bik *bit *mit mit-[, mi?-1I 
CRY (yer) Yap jap zap! —*krap-L kra?-II  tàp-I, tàh-II 
SHELF ( FIREPLACE) rap rap rak *rap *rap ràp 


Nasal codas correspond to nasal codas in Kuki-Chin (Table 62). 


Table 62 - Nasal codas compared to Kuki-Chin 


Gloss B KR CT PP PKC Mizo 

2sG naa nag naa *nag *nay náng 

STONE (ka-lig) ` ka-hug - *ka-Iu T) *lug lun 

FIREWOOD JiN hain haen *sjen® *thiy thin 

DRINK in in [rir] *in ? in” 

MARROW (a-lyiN) a-hin a-tiņ *ą- lin *khlik'® »«khlig thlíng 

RIPE a-min a-min a-miy *a-min *hmin hmín 

SMELL nam nam nag *nam *nam nám 

PILLOW ka-kom ` kog-kem ko-kam  *kog-komÜ *kham <khum ` lu-khàm (FL)? 
HOLDIN MOUTH mom ? mom *mom *hmoom hmáwm 


For the nasal-stop class it appears that the nasal-stop coda corresponds in almost all cases to a nasal in Proto-Kuki-Chin (Table 63).


Table 63 - Nasal-stop codas compared to Kuki-Chin 


Gloss B KR CT PP PKC Mizo 
DREAM baN bag bak *bay *may mang 
HEART a-luN-bəə  a-lug-bao a-lok-baa —_*a-loy-baa *luy lung 
BELLY a-tyi-buN ` hui-bug  a-łue-buk ~—*a-tui-buy (*pum pim)? 
(EXTERIOR) 

FINGERNAIL age? ga-sin gei-sin gei-sik *ge-sin *fin tin 


16 The Bulu form would be z? is expected to have a labial coda gn A similar alternation in Kuki-Chin is 
grammatical and not necessarily related to the question in Puroik (Linda Konnerth p.c.) 

17 From Weidert (1975)

18 The stop coda occurs only in Lai.

19 The first part also means ‘head’ (Falam Lai /iiu), like the first part in Puroik Zo

20 Kuki-Chin has final bilabial, Puroik final velar. 
Gloss B KR CT PP PKC Mizo

ALIVE a-seN a-san a-sik *a-sen *hrig-I, hrin-Il ` hríng-I, hrin-II 
LIVER a-pjiN a-pjin a-pjik *a-pjin *thin thin 

DOOR haN-wuiN ` ha-wun ` fuk-wuik | *HOUSE-wuR "thun x than ` thin?! 
MORTAR satsom fugfom ` tfjugf2k *fur-fomO *shum sum 

WARM a-lam a-lam a-lap *a-lom Sum Xhlum ` lum 

PATH lim lim lik (Saria) *lim *lam lam 

SHADE a-lim a-him a-lap *a-tim *hli(i)m hlim 

TASTY (a-jim) a-rjem a-rjep *a-rjem *lim SS 

THREE im Jim wk oi" *thum pà-thüm 


No Puroik dialect allows r or l codas. In the two available cases, final *-l in Kuki-Chin corresponds to *-n in Proto-Puroik (Table 64), while all r-final roots with plausible cognates in Puroik correspond to the nasal-stop class *-ñ (Table 65). This is the only external evidence so far that Puroik nasal-stop rhymes could have a different origin than nasal-nasal rhymes.


Table 64 - Kuki-Chin l-codas compared to Puroik


Gloss B KR CT PP PKC Mizo 

HAIR (ON BODY) a-min a-man a-muiy *a-mun *mul x hmul hmil 

GUTS a-lyi-brin a-hui-rin | a-lue-rig — *a-tui-rin *ril x rul ril 

Table 65 - Kuki-Chin r-codas compared to Puroik 

Gloss B KR CT PP PKC Mizo 

FLOWER a-bueN hain-buan ma-buaik = *buan *paar páar 

SWELL pon pon poik (*pan *puar püáar)? 

DRY a- a-wuan a-wuaik ^ *a-wuan ` *waar váar^* 

wueN 

NEW (OF a-feN  a-fan a-faik *a-fan (*thar thar) 

THINGS) 

OLD (OF a-tseN a-fjen a-ffaik a-ffjan (*tar tar) 

THINGS) 

EAR a-kuiN | a-kun a-kuik *a-kun *khur X<khor ` khür? 
21 ‘put in’ 


22 Not attested in Central Chin. But Tiddim /im? (Northern Chin).

23 Onsets do not correspond according to the expected pattern. Puroik is expected to have b like FLOWER.

24 “white, to be light (not dark)’ Maybe because colour of clothes, grass, leaves, stones is lighter when they are 
dried. 

25 “hole, pit, cavity’ but ‘ear canal’ in Sorbung, Moyon, Kom Rem (Northwestern Kuki-Chin, previously “Old 
Kuki”) 
5.2. Open rhymes compared to Kuki-Chin


Open rhymes in Puroik also correspond to open rhymes in Kuki-Chin (Table 66).


Table 66 - PP *-ii : *-ii


Gloss B KR CT PP PKC Mizo 

DAY a-nii a-nii a-rii *a-nii *nii ní 

DIE ii ii ii KI *thii-I, thi2-1I thi-[, thih-II 
PERSON [prin] bii bii *bii *mii mî 


What VanBik analyses as a consonantal coda PKC *y, corresponds to the second element 
of i-diphthongs in Puroik (Table 67). The reason for the different “nuclei” in the potentially 
cognate forms, however, is not transparent yet. 


Table 67 - i-diphthongs compared to Kuki-Chin 


Gloss B KR CT PP PKC Mizo 

FIRE bee bai bee *bai *may méi 

NEAR a-nyi ` a-nui a-nui *a-nui —— *naay 2 hndi-I, hnaih-II 
hnaay 

FLOW nye nuai rue *ńuai *hnaay hnái 

TONGUE a-lyi jui a-rue *q-lui® *lay léi 

FRUIT fiN-wee hain-wai | rog-wee *wai (*thay théi) 

CANE rii rei rei *rej *ruy x hruy — hrüi 

BREAST (FEMALE) a-njee ` a-njei a-njee *a-njai *hnooy hnóoy (FL) 


5.3. Summary of rhyme correspondences 


The examples available provide preliminary evidence that the Puroik nasal-nasal class corresponds to nasals in Kuki-Chin, the stop-stop class to stops, and the nasal-stop class to nasals, as in the western dialects of Puroik. If the comparisons in Table 65 (Kuki-Chin r-codas compared to Puroik) turn out to be true correspondences, this would prove that, at least for alveolars, there must be something behind the symbol PP *-ñ which is neither nasal nor stop. It is not possible that there were only two Proto-Puroik coda classes if Kuki-Chin r-codas are consistently represented differently from *t and *n.


6. Onset correspondences with Kuki-Chin 
6.1. Onset plosive correspondences 


From the few available examples, it appears that Puroik voiceless plosives correspond to unvoiced aspirated plosives, and Puroik voiced plosives to unvoiced unaspirated plosives in Kuki-Chin. However, for the first correspondence there are only examples for the velar place of articulation. The reason why examples for Puroik t vs. PKC *th are missing is probably that Kuki-Chin *th has a different origin (< *0, see §6.5).
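

The onset correspondences described above can be checked mechanically against candidate cognate pairs. The following is a minimal sketch under stated assumptions: the function name and the ASCII spellings are hypothetical, the root is taken to be the last hyphen-separated element of the Puroik form, and only the correspondences attested so far (k to kh, and voiced g, d, b to unaspirated k, t, p) are encoded.

```python
# Expected PKC onset for a given Proto-Puroik root onset (attested cases only).
EXPECTED_PKC_ONSET = {"k": "kh", "g": "k", "d": "t", "b": "p"}


def onset_matches(pp_form: str, pkc_form: str):
    """Return True/False if the pair can be checked, None if no rule applies."""
    root = pp_form.lstrip("*").split("-")[-1]  # drop prefixes such as a-
    pkc = pkc_form.lstrip("*")
    expected = EXPECTED_PKC_ONSET.get(root[0])
    if expected is None:
        return None  # e.g. Puroik p, t: no examples for the pattern so far
    if expected.endswith("h"):
        return pkc.startswith(expected)
    # Unaspirated onset expected: the plosive must not be followed by h.
    return pkc.startswith(expected) and not pkc.startswith(expected + "h")


pairs = [("*a-kun", "*khur"), ("*a-gat", "*kut"),
         ("*a-dou", "*tuu"), ("*buan", "*paar"), ("*a-pua", "*paa")]
for pp, pkc in pairs:
    print(pp, pkc, onset_matches(pp, pkc))
```

Pairs for which the function returns None, such as MALE/FATHER, have to be evaluated by hand.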


26 ‘pus, sap, juice, exudation’

Table 68 - k compared to Kuki-Chin kh


Gloss B KR CT PP PKC Mizo 
PILLOW ka-kom  kog-kom  ko-kom  *kog-kom  *kham ze lu-khàm 
khum 
EAR a-kuiN | a-kun a-kuik — *a-kun *khur x khor — khür" 
SKIN a-ku? | a-ki? a-koo *a-ku? *khok-I, kho?-II — khok-I , kho?-II (HL)? 
SMOKE be-kiüi — bai-koo bee-kii — *bai-kiü *may-khuu méi khá 


Table 69 - Voiced plosives compared to Kuki-Chin unvoiced unaspirated plosives 


Gloss B KR CT PP PKC Mizo 
ISG guu goo goo *g00 *kay 3€ kay-ma? kér (ka) 
IPL g-rii go-nii g-rei *ga-riei ? keini 
HAND/ARM a-ge?  a-gei? a-geik *q-got *kut x khu’ küt 
CHILD a-doo ` a-doo a-dou *a-dou? *tuu tu-té 
FLOWER a-bueN  hain-buan ma-buaik *buan *paar paar 
BELLY a-lji- | hui-bug a-lue-buk *a-lui-bug (*pum pum) 


(EXTERIOR) buN 


A sure counter-example is the root MALE/FATHER³⁰ (Table 70), which is unaspirated in Kuki-Chin. This could be because ‘mama-papa’ roots tend to be phonologically similar across languages rather than strictly subject to sound changes. There is no good explanation for SWELL yet. Maybe these roots are not cognate at all, though the codas and semantics correspond well.³¹


Table 70 - Counter-examples to stop correspondence 


Gloss KR CT PP PKC Mizo 
MALE/FATHER q-poo ` a-pua a-pua (*a-pua *paa pa) 
SWELL pan pan paik (*pan *puar puar) 


6.2. b to m correspondence 


In a few but basic lexical items a Puroik syllable-initial /b/ corresponds to /m/ in other Tibeto-Burman languages (Table 71). This was first noticed by Jackson Sun (1992, 80) and Matisoff (2009, 309).


27 ‘hole, pit, cavity’ but ‘ear canal’ in Sorbung, Moyon, Kom Rem (Northwestern Kuki-Chin, previously “Old Kuki”)

28 ‘peel, strip off’

29 Aspirated only in Northern Chin varieties.

30 The root means both ‘male’ and ‘father’ in both branches. ‘Male’ in Bulu, Kojo-Rojo. ‘Father’ in Chayangtajo and Mizo. ‘Father’ and ‘male’ but with different tones in both Lai varieties described by VanBik (2009).

31 It is unclear to what the PKC nucleus ua is supposed to correspond.


Table 71 - b to m correspondence


Gloss B KR CT PP PKC Mizo 

FIRE bee bai bee *bai *may méi 

DREAM baN bay bak RI? *may máng 

SON-IN-LAW a-bo? bua? a-bua "hua *maak máak pa 

NAME a-bjeN a-ben | a-baeg — *a-bjen (*min Ax hmíng) 
*hmin 

PERSON [prin] bii bii *bii *mii mí 

EXTINGUISH [ge?] bi? bik *bit *mit mit-I, mi?-II 

(INTR) 

NEGATION ba- ba- ba- *ba- *ma-^ - 


Counterexamples are given in Table 72:


Table 72 - Counterexamples to b to m correspondence 


Gloss B KR CT PP PKC Mizo 

HAIR (ON BODY) a-min a-mən a-muiņ *a-mun *mul XxX hmiül 
hmul 

RIPE a-min | a-min a-miņ *a-min *hmin hmín 

HOLDINMOUTH mom ? mom *mom *hmoom hmáwm 

FEMALE/MOTHER a-mə2  a-mua a-mua *q-mua [*maw mó” 


Matisoff (2009: 309), who examines the data from Li (2004), formulates the
correspondence as a general Puroik denasalisation “pTB *nasals > Sulung voiced stops”. He
gives three counter-examples to the rule, namely ‘corpse’ *s-may > ca1?*muan 75, ‘smell’
*m/s-nam > nar 75 and ‘ripe/well-cooked’ *s-min > a?'min 75, and explains that the nasal coda of
these roots could have “blocked the denasalization of the initial”. He adds: “The Sulung form for
DREAM evidently descends from the stop-final allofam that is also found in Lolo-Burmese
(e.g. WB ʔip-mak, Lahu yi?-má?)”.

I think that Matisoff's rule is too general, and the reasons given for the counter-examples 
are questionable: First, there are hardly any non-labial examples to instantiate a general rule 
PTB *nasals > Puroik voiced stops. On the contrary all available examples suggest that what 
is *n or *hn in Kuki-Chin corresponds to Puroik *n (Table 73) or possibly to *z (Table 74).
Hence, Matisoff's second counter-example ‘smell’ *m/s-nam > nay”? does not need any
explanation.


32 The negation *ma- is not reconstructed to PKC by VanBik, but see DeLancey this volume. Preverbal negation
ma- is also well attested in other branches of TB (e.g. Miji-Bangru ma-, Proto-Ao *ma-).

33 ‘bride, a daughter-in-law, a sister-in-law, a brother's wife’




North East Indian Linguistics 7 


Table 73 - n to n correspondence 


Gloss B KR CT PP PKC Mizo 

SMELL nam nam nay *nam *nam nam 

2sG naa nay naa "nay? — *nag náng 

BROTHER à-n22 a-nua a-nua  *a-nua  *naaw náu 

(YOUNGER) 

BREAST (FEMALE)  a-nje& ` a-njei a-njee  *a-njai ` *hnooy hnóoy (Falam Lai) 
NEAR a-nyi a-nui (a-nui) *a-nui *naayXhnaay hnăi-I, hnàih-II 
TWO ni? (nii) nii *ni? *ni? xX hni? ` hnih 


Table 74 - n to n correspondence 


Gloss B KR CT PP PKC Mizo 
DAY a-nii a-nii a-rii *a-hii ` *nii ní 
FLOW nye nuai rue Tue — *hnaay hnái* 


The only possible non-labial example in Matisoff's as well as in my data is the 1SG *goo
as compared with, for example, Tani *ŋo. However, Puroik *goo does not necessarily have to
be compared with nasal forms, but could as well be cognate with non-nasal 1SG forms across
TB, such as Mizo kéi (enclitic ka) or Lepcha 1SG go (Plaisier 2007).

Secondly, it does not strike me as “evident” that the root *baŋ for DREAM descends from a
stop-final allofam, because the western two dialects of Puroik have a final nasal and this class of
roots corresponds to a final nasal at least as far as the comparison with Kuki-Chin goes (Table
63)³⁵.

Thirdly, Matisoff's first counter-example ‘corpse’ *s-may > q24?5nuar ?? is probably a 
Tani loan (Bokar So-moy ~ Si-moy (Sun 1993)). This root does not occur in the western two 
Puroik dialects, which were less influenced by Tani languages. 

Matisoff does not list the root for NAME with the examples for the b to m correspondence. 
I think it is correct that the Puroik root and the Kuki-Chin root are not directly
comparable. The Puroik root has a /a~e/ nucleus and a complex onset while the Kuki-Chin
root has a simple onset and a nucleus /i/. The comparison proposed by STEDT³⁶ with Lepcha
Pábryáng (Plaisier 2007) and Nungic Trung arf? buur?? (Sūn et al. 1991) is more likely, since
those roots also have a complex onset with b and a nucleus which is not i.

Considering the newly available data, Matisoff's rule seems questionable. But is there a
better way to explain the counterexamples to the syllable-initial b to m correspondence? If the
root for NAME is really not cognate, it appears that all remaining forms in Table 71
corresponding to b in Puroik have a voiced onset m in Mizo, and three of four
counterexamples in Table 72 start with a voiceless hm in Mizo. The counter-example
FEMALE/MOTHER in Table 72 might either not be cognate at all (the rhyme and semantics also
do not correspond well) or it might be due to a universal tendency for MOTHER words to start
with m. But why would a voiced m be denasalised rather than an unvoiced hm? This is a
question for further research. It is, however, reminiscent of the denasalisation in Southern


34 ‘pus, sap, juice, exudation’ 

35 Note also that the coda in counter-example HAIR (ON BODY) compares to a lateral and not to a nasal in many
Tibeto-Burman languages including Kuki-Chin. Although Matisoff does not list this root as a counter-example
for *m>b, he derives a?'mun 75 ‘body hair’ from PTB *mil ⪤ *mul in the same paper (Matisoff 2009: 299). This
is not impossible but assumes that *-l > *-n before *m > *b.

36 http://stedt.berkeley.edu/~stedt-cgi/rootcanal.pl accessed on 11/11/2015 
Loloish Bisu where only plain nasals seem to be denasalised and not nasals in complex onsets,
as for example nasals originating from PLB *s-m and *s-n (Matisoff 2003: 39).
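
The conditioning considered here (Kuki-Chin plain m against Puroik b, Kuki-Chin voiceless hm against Puroik m) can be checked mechanically against the reconstructions. The following Python sketch is only an illustration under stated assumptions: the forms are ASCII approximations of the Proto-Puroik and Proto-Kuki-Chin reconstructions in Tables 71 and 72, roots with two competing PKC variants (such as HAIR) are left out, and the function names are hypothetical.

```python
# (gloss, Proto-Puroik form, Proto-Kuki-Chin form)
# ASCII approximations of the reconstructions in Tables 71 and 72.
PAIRS = [
    ("FIRE", "bai", "may"),
    ("PERSON", "bii", "mii"),
    ("EXTINGUISH", "bit", "mit"),
    ("SON-IN-LAW", "bua?", "maak"),
    ("RIPE", "a-min", "hmin"),
    ("HOLD IN MOUTH", "mom", "hmoom"),
    ("FEMALE/MOTHER", "a-mua", "maw"),
]


def expected_pp_onset(pkc_form: str) -> str:
    """Onset predicted for Proto-Puroik under the working hypothesis:
    PKC hm- is kept as m-, PKC plain m- is denasalised to b-."""
    if pkc_form.startswith("hm"):
        return "m"
    if pkc_form.startswith("m"):
        return "b"
    raise ValueError(f"not an m-initial root: {pkc_form}")


def pp_onset(pp_form: str) -> str:
    """First segment of the root, ignoring a prefix such as a-."""
    root = pp_form.split("-")[-1]
    return root[0]


def exceptions(pairs):
    """Glosses whose Proto-Puroik onset does not match the prediction."""
    return [gloss for gloss, pp, pkc in pairs
            if pp_onset(pp) != expected_pp_onset(pkc)]


print(exceptions(PAIRS))
```

On this small sample the only residue is FEMALE/MOTHER, in line with the remark that MOTHER words may follow a universal tendency rather than the sound correspondence.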

Puzzling is also the fact that of the two inherited m-prefixes (both voiced) one is
denasalised and the other not: The lexical noun prefix ma- is never denasalised. The negation
ba- (<*ma-) is always denasalised. The reason might have been that the *m-prefix was
syllabic or was part of a complex onset *mC while the m of the preverbal negation formed a
simple onset *ma-C. In modern Bulu Puroik the m-prefix is sometimes pronounced as a syllabic
[m̩] without schwa [ə] (the spectrogram does not show formants), while the negation always has
a vowel that is audible and visible in the spectrogram. This could be the reason why the prefix m
was preserved (*mC > ma-C) but the negation was denasalised (*ma-C > ba-C).

Whatever the answer turns out to be, these few basic roots in Table 71 do have considerable
importance for the classification of Puroik: they are clearly Tibeto-Burman, clearly not Tani,
Hrusish (Miji, Bangru, Hruso), Bodish or Bodo-Garo, but they are in this form shared with
the Kho-Bwa languages in West Kameng. This would support any classification of Puroik as
a Tibeto-Burman language which is more closely related to Kho-Bwa than to Tani, Hrusish or any
other language in the region.


6.3. Non-nasal sonorant onsets 


*r, *l, *w correspond in Proto-Puroik and Proto-Kuki-Chin in most plausible cognates
(Table 75).


Table 75 - *r, *l, *w compared to PKC


Gloss B KR CT PP PKC Mizo 

SIX ra? rə? rək *rək *ruk ruk 

SHELF (FIRE) rap rap rak *rap *rap ràp 

CANE rii rei rei Frei”) *ruy 3€ hruy brut 

GUTS a-łyi-rin | a-hui-rin | a-lue-rig — *a-lui-rin *ril x rul ril 

WARM a-lam a-lam a-lap *a-lam *lum Xx hlum lum 

PATH lim lim lik *lim *lam lam 

(Saria) 

BOW (lii) lei lei ale") *[ii líi (Hakha Lai) 

HEART a-luN-baa  a-lug-baa a-lok-baa *a-log-bea Sun lung 

TONGUE a-lyi jui (a-rue) | *a-lui *lay léi 

LEECH [pa-]we? | [pa-]wai? ka-waik | *ka-wat *wat 3€ wot Xwut vàng vat 

FART wai? wai WEE *wai? woy? X wey? voy? (Hakha 
Lai) 

DRY a-wueN a-wuan a-wuaik ` Soen *waar váar?? 


6.4. u-“epenthesis” 


Compared to other Tibeto-Burman languages, a number of roots in Puroik have an additional 
u. In most of these cases apparent cognates in PKC have a long aa in a closed syllable (Table 
76). Only PKC *paa MALE/FATHER is not in a closed syllable (onset correspondence is also 


37 ‘white, to be light (not dark)’. Maybe because the colour of clothes, grass, leaves and stones is lighter when they are
dried.
problematic). The rhyme of *a-nui? NEAR, supposed to correspond to the PKC rhyme *-aay,
is unexpected and the comparison is possibly wrong.


Table 76 - u-epenthesis 


Gloss B KR CT PP PKC Mizo 

SON-IN-LAW" a-bo? | bua? a-bua *bua? *maak maak pa 

BROTHER a-n)  a-nua a-nua *a-nua *naaw nau 

(YOUNGER) 

FAT/GREASE 4-329 a-zjaa a-zua *a-zua  (*thaaw thàu) 

FLOWER a-bueN  hain-buan | ma-buaik *buan *paar páar 

DRY a-wueN a-wuan a-wuaik | *a-wuan ` *waar váar^ 

NEAR a-nyi — a-nui (a-nui) ` *a-nui? ` (*naay S hnaay — hndi-I, hnáih-II) 

FLOW nye nuai rue Tua) *hnaay hnái?? 

MALE a-poo ` a-pua a-pua (*a-pua *paa pa) 

BITTER (a-ffa)) | a-ffua? a-ffjaa (*a-qua®”  khaa-I, khaat zs — khà-I, kháak-II) 
khaak-IT 


Roots corresponding to a short a nucleus in Kuki-Chin do not have the “epenthetic” u. 


Table 77 - no u-epenthesis 


Gloss B KR CT PP PKC Mizo 
SMELL nam nam nay *nam *nam nam 
DREAM baN bay bak *ban *may mang 
LEECH [pa-] | [pa-] ka-wai?  *ka-wat *wat 3€ wot X vàng vat 
we? wai? wut 
KILL [we?] ai? ai? *at that-I, -tha?-1I that-I, thah-II 
CRY (fe?) an jap *fjap krap-l, kra?-II tap-I, tàh-II 
SHELF (FIREPLACE) rap rap rak *rap *rap rap 
FIRE bee bai bee *bai *may méi 
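
The complementary distribution described in Tables 76 and 77 (Puroik ua where Kuki-Chin has long aa, Puroik a where Kuki-Chin has short a) can be stated as a simple check. The Python sketch below is an illustration only: the forms are ASCII approximations of the reconstructions in the two tables, the function names are hypothetical, and the test deliberately ignores syllable structure, so it does not distinguish closed from open syllables.

```python
# (gloss, Proto-Puroik form, Proto-Kuki-Chin form)
# ASCII approximations of the reconstructions in Tables 76 and 77.
EXAMPLES = [
    ("SON-IN-LAW", "bua?", "maak"),
    ("BROTHER (YOUNGER)", "a-nua", "naaw"),
    ("FLOWER", "buan", "paar"),
    ("DRY", "a-wuan", "waar"),
    ("SMELL", "nam", "nam"),
    ("SHELF (FIREPLACE)", "rap", "rap"),
    ("KILL", "at", "that"),
]


def expects_u(pkc_root: str) -> bool:
    """True if the PKC root has a long aa nucleus."""
    return "aa" in pkc_root


def has_u(pp_root: str) -> bool:
    """True if the Proto-Puroik root shows the additional u (nucleus ua)."""
    return "ua" in pp_root


def mismatches(examples):
    """Glosses where the Puroik nucleus is not the one predicted from PKC."""
    return [gloss for gloss, pp, pkc in examples
            if has_u(pp) != expects_u(pkc)]


print(mismatches(EXAMPLES))
```

An empty list means that every root in the sample behaves as described; roots with insecure onsets or semantics (e.g. NEAR, BITTER) are not included in the sample.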


Examples in Table 78 suggest that the rule might also apply to roots with long ii in
Kuki-Chin.


Table 78 - u-epenthesis in roots corresponding to PKC *ii? 


Gloss B KR CT PP PKC Mizo 
BELLY (INTERIOR)  a-lyi a-hui a-lue (*a-tui *mik-khlii) mit-thlii (FL)^ 
BLOOD a-hui a-fui a-hue (*a-hui *thii) thí 


38 ‘white, to be light (not dark)’ 

39 ‘pus, sap, juice, exudation’

40 ‘tears’ << EXCREMENT << SHIT << INTESTINES. Formally related roots in e.g. Central Naga all mean ‘shit’: PCN
*a-khlaj?.
However, the onset correspondence of both of these two roots is insecure, and furthermore
in semantically and formally more straightforward cases PKC *ii corresponds to PP *ii (Table
79).


Table 79 - Counterexamples to potential ui vs. ii correspondence. 


Gloss B KR CT PP PKC Mizo 

DAY a-nii a-nii a-rii *a-nii *nii ni 

DIE ii ii ii KI thü-I, thi?-II thí-I, thih-II 
PERSON - bii bii *bii *mii mí 


There are further examples where Puroik seems to have an additional u as compared to
other languages with i, e.g. WOMAN KR a-mui : CT a-mui. However, it is not entirely clear at
this point whether the Puroik correspondences projected back as *ui and *ua were really
diphthongs or rather stand for a different vowel quality, in which case the term “epenthesis”
would be mistaken. For example PP *ua might have to be revised to PP *ɔ (§3.3). Even if not at a
Proto-Puroik stage, it would make sense to reconstruct the correspondence *ua vs. *aa to *ɔ
for a common ancestor of Kuki-Chin and Puroik. There are typologically parallel attested
sound changes for *ɔ > PKC *aa (Brugmann's law in Indo-Aryan⁴¹) as well as for
(Pre-)Puroik *o > ua (Romance breaking⁴²).


6.5. Proto-Kuki-Chin *th 
Kuki-Chin *th corresponds to s in many Tibeto-Burman languages (VanBik 2009, 16-18). As
for cognates of these roots in Puroik, a handful of examples would be compatible with the
hypothesis that whatever Kuki-Chin *th goes back to got lost in Puroik (Table 80).


Table 80 - Ø to PKC *th correspondence 


Gloss B KR CT PP PKC Mizo 

KILL (we?) ai? aik *at *that-I, -tha?-II _ that-I, thah-II 
DIE ii ii ii Ki *thii-I, thi?-II thí-I, thih-II 
THREE im „im uk "hn  *thum pà-thúm 
DOOR haN-wuiN ha-wun fuk-wuik *wuñ *thun xthan thün? 

CUT (WITHOUT i? i? ii *jp *thi? sâm thi? 
LEAVING ) 

CAVE wu? u? 00 *wo? *thuuk thûuk” 


Other examples with *th in Kuki-Chin have a wild variety of initials in Puroik (h, f, sj, z,
pj, w), as seen in Table 81.


41 E.g. Proto-Indo-European (PIE) non-oblique (nominative, accusative) kinship suffix *-ter > Sanskrit -tar
(mātar-am ‘mother-ACC’), but PIE non-oblique agent noun suffix *-tor > Sanskrit -tār (e.g. dātār-am ‘giver-ACC’)

42 E.g. GATE Latin portam > Spanish puerta, Romanian poartă

43 ‘put in’

44 ‘comb’ (n.)

45 ‘deep, profound’
Table 81 - Other Puroik rhymes corresponding to Kuki-Chin roots with onset *th


Gloss B KR CT PP PKC Mizo 

BLOOD a-hui — a-fui a-hue *a-hui *thii thí 

NEW (OF a-feN | a-fan a-faik *a-fan *thar thár 

THINGS) 

FIREWOOD JLN hain haen zeien)" *thiy thiy 

FAT/GREASE 4-322  a-zjaa a-zua *a-zua *thaaw thau 

FRUIT JON. hain-wai | rong-wee *wai *thay théi 
WEE 

ITCH ER a-wua a-wua *a-wua [*thak thak] 

LIVER a-pjiN  a-pjin a-pjik *a-pjin *thin thin 


The variety of PP onsets that putatively correspond to PKC *th in Table 81 looks hopeless
at first sight. But it is remarkable that of all the examples that VanBik (2009, 17) presents
for PTB *s- > PKC *th- (ITCH, DIE, FRUIT, KILL, LIVER, THREE) all but one (ITCH) have a rhyme
that is compatible with Puroik. This gives some hope that these strange onsets could
eventually be explained, perhaps as an interaction between a fricative sound (θ?) and a prefix.
A good candidate is the root for LIVER *a-pjin. In VanBik's data set there are two languages
which indeed have a p-prefix: Khumi (Southern Chin) pthueng and Lakher (Maraic) pa-thi. If
this is old and these two languages go back to something like *pθiy, then there is nothing
strange about Puroik *a-pjin anymore: *θ would just get lost like in the examples in Table 80,
only leaving behind an on-glide to the following i. In a similar way the other examples in
Table 81 might (or might not) have an explanation with old prefixes.

The onset of the root for FIREWOOD, however, goes back to a cluster PP *sj, which is
remarkable because this very widespread root in Tibeto-Burman shows reflexes of a simple
onset *s almost everywhere. This recalls the root for NAME *a-bjen in Table 71, where Puroik
also has a cluster onset compared to a simple onset in many other TB languages. PKC *min
‘name’ projected to Puroik should be *bin, which is not that far from our proposed PP *a-bjen.
It would be unsatisfying to treat these roots as completely unrelated. Since there is no way to
explain the Puroik forms as innovations, it is possible that Puroik reflects an older stage and
Kuki-Chin innovated *sjen > PKC *thiy and *mjay > PKC *min ⪤ *hmin.


Table 82: Archaic roots? 


Gloss B KR CT PP PKC Mizo 
FIREWOOD IN hän haen zeien)" *thiy thiy 
NAME a-bjeN | a-buen a-biey *a-bjen *min X*hmin hming 


6.6. Origin of the lateral fricatives 


The lateral fricative PP *ł corresponds to *k(h)l in Kuki-Chin (Table 83). There is one
example with a voiceless lateral approximant in Kuki-Chin as well (SHADE), and one example
with a normal voiced lateral in Kuki-Chin (STONE)⁴⁶. However, the lateral fricative in Puroik is
only attested in one dialect (KR). CT has a different root (kob.aa) and Bulu has a normal
lateral like Kuki-Chin.


46 VanBik's reconstruction for PKC might have to be revised to *hluŋ, since in the conservative Northwestern
Kuki-Chin languages Monsang and Anal the word for stone hlung has a voiceless onset as well (Linda Konnerth
p.c.).
Table 83 - *ł to *k(h)l correspondence


Gloss B KR CT PP PKC Mizo 
GHOST ma-lao  mo-lau | mo-laa *mo-laa *khlaa thlaa 
FALL lu? lu? (jok-lo) ` Sol) kluu-I, kluuk-II.— tlà-I, tlàuk-II 
MARROW  (a-yiN) a-hin a-lin *q- lin *khlik »«khlin ` thling 
BELLY a-lyi a-hui a-lue *a-lui *mik-khlii mit-thlii (FL) 
(INTERIOR) 
SHADE a-lim a-him a-lap *q-lim *hli(i)m hlim 
STONE (ka-lin)  ka-hug - *ka-lun®  *luņ lin 
(ka-tun) 


6.7. Summary of suggested correspondences with Kuki-Chin 


Table 84 - Summary of suggested correspondences with Kuki-Chin 


B KR CT PP PKC # Table 
-2 -? -Ø SA *- p» 4 59 
-? -? -Ø Kä SE 3 60 
-? -? -k SE SE 2 61 
-? -i? -ik T Wi 4 61 
-P -P -p/k TD *-p 2 6l 
H -y 4 *4 *y 2 62 
-ViN -n -in *-n *-n/n 4 62 
-ViN -n -in *-n */] 2 64 
-m -m -m *-m *-m 3 62 
sp E] -k TH *y 2 63 
-ViN -n -ik TH *-n 4 63 
-ViN -n -ik TH *y 6 65 
-m -m -p/k *-m *-m 6 63 
-ii -ii -ii Wäi *-jj 3 66 
-V(i) -Vi -V(i) *-yi TD 0 67 
-o(C) -ua(C)  -ua(C) *-uaC  *-aaC 9 76 
-aC -aC -aC Sat Sat H 77 
k- k- k- *k- *kh- 4 68 
g- g- g- E *k- 3 69 
d- d- d- *d- Kë 1 69 
b- b- b- *b- *p- 2 69 
b- b- b- *b *m- 6 71 


47 “tears’<<EYE SHIT << SHIT<<INTESTINES Formally related roots in e.g. Central Naga all mean ‘shit? PCN *a- 
khləj? (Bruhn 2014) 
B KR CT PP PKC # Table
m- m- m- *m- *hm- 3 72 
n- n- n- *n- *n- 3 73 
n- n- n- *n- *hn- 3 73 
n- n- r- *ý- *n 1 73 
n- n- r- *ü- *hn 1 73 
r- r- r- *p- *p- 4 75 
l- l- l- *[- *[- 5 75 
w- w- w- *w- *w- 3 75 
GL GL GL Të *th- 6 80 
ł- h- L TL *k(h)- 4 83 
i- h- i- TL *hl- 2 83 


7. Lexical evaluation 


In §5 and §6 it was assumed that a certain number of Puroik roots have cognates in Kuki-Chin
and an attempt was made to identify regular sound correspondences. But how big and
significant is this set of words with cognates in Kuki-Chin or elsewhere in Tibeto-Burman? Is
it a negligible set or is it indicative of a phylogenetic affiliation?


7.1. Assessing the percentage of Tibeto-Burman basic roots in Puroik 
The roots which were assumed to have cognates in Kuki-Chin are repeated here (Set 1)⁴⁸:


Set 1 

1SG *goo : *kay3« *kayma? 

2sG *nag? : *nay 

IPL *ga-nei : Mizo keini ‘we (exclusive) 
2PL *na-ńei® : Mizo nangni>° 

BOW *lei® : *lii 

BREAST (FEMALE) *a-njai : *hnooy 
BROTHER (YOUNGER) *a-nua : *naaw 
CANE?! *rei ` *ruy xX *hruy 

CHILD *a-dou” : *tuu 

CRY “jap” : *krap-I, *kra?-II 

DAY *a-nii : *nii 

DIE *ii : *thii-I, *thi?-1I 

DIG *fo? : *tsaw, *tso?-IT 

DREAM "bar, : *man 

DRINK *in : Mizo ín” 




48 Gloss PP : PKC. The list excludes SUN PP *a-nii, which is cognate with PKC *nii, because it is arguably the same
root as DAY and is not counted twice.

49 Marrison (1967)

50 Marrison (1967)

51 CANE replaces ROPE in the Leipzig-Jakarta and Swadesh 200 list, because the prototypical thing to tie
something with in the Puroik culture is a cane rope.

52 Weidert (1975)
EAR *a-kun : *khur 5€ khor?
EXTINGUISH *bit : *mit 

FALL (FROM A HEIGHT) *łuk®? : *kluu-I, *kluuk-II 
(FART)? wai? : *woy? X *wey? 
FATHER/MALE *a-pua : *paa 

FIRE *bai : *may 

FLOWER *buan : *paar 

GHOST?? *ma-taa : *khlaa 

GUTS?? *q-lui-rin : *ril 3<*rul 

HAIR (ON BODY) *a-mun; *mul x *hmul 
HAND/ARM “*a-gat : *kut 3€ *khut 
HEART *a-loy-baa : Sun 

{HOLD IN MOUTH} *mom : *hmoom 
KILL *at : *that-I, *-tha?-II 

LEECH *ka-wat : *wat xX wot X wut 
LIVER *a-p-jin : *thin 

MARROW *a-lin : *khlik x *khlin 
NEAR *a-nui? : *naay X *hnaay 
NEGATION *ba- : *ma- 

PATH *lim : *lam 

PERSON "bii : *mii 

{PILLOW} *kog-kom) : *kham x  *khum 
RIPE/WELL-COOKED *a-min : *hmin 
{SHELF (OVER FIREPLACE)} *rap : *rap 
SHADE *a-lim? : *hli(i)m 

SIX *rok : *ruk 

SKIN *a-ku?? : *khok-I, *kho?-II 
SMELL *nam : *nam 

SMOKE *bai-Kkii? : *may-khuu 
SON-IN-LAW *bua? : *maak 

STONE *ka-luy : *luy 

THREE "iri? : *thum 

TWO "ni? : *ni? X *hni? 

WARM *a-lam : *lum X hlum 


Subsets of this list make up at least 15 percent of every commonly used word list in
lexicostatistics (Table 85). 7 of them are among Matisoff's (2009) 12 supposedly most stable
Tibeto-Burman roots (58.5%), and 17 are among his 47 most stable roots in TB (36%).


53 ‘hole, pit, cavity’ but ‘ear canal’ in Sorbung, Moyon, Kom Rem (Northwestern Kuki-Chin, previously “Old
Kuki”)

54 Roots in {} are in no basic wordlist.

55 For SOUL in Sun's list. 

56 For INTESTINES. 
Table 85 - Cognacy percentage using different basic wordlists


Wordlist                 Set 1    Set 2 (controversial)    Set 3 (Other TB)
Leipzig-Jakarta 100⁵⁷    18%      24%                      36%
Swadesh 100⁵⁸            19%      26%                      40%
Swadesh 200⁵⁹            15.5%    22.5%                    31.5%
CALMSEA 200⁶⁰            18.5%    25%                      33.5%
Sun 200⁶¹                18.5%    26%                      35%
Stable 47⁶²              36%      42.5%                    53%
Stable 12⁶³              58.5%    66.5%                    66.5%
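
The percentages in Table 85 are simple shares of a wordlist for which a cognate is proposed. As a hedged illustration, the Python sketch below recomputes the two Set 1 figures for which the text gives raw counts (7 of the Stable 12 roots and 17 of the Stable 47 roots). The function name is hypothetical, and rounding to the nearest half percent is an assumption inferred from the values reported in the table.

```python
def cognacy_percentage(n_cognate: int, list_size: int, step: float = 0.5) -> float:
    """Share of a wordlist with proposed cognates, in percent,
    rounded to the nearest multiple of `step` (assumed: half a percent)."""
    raw = 100.0 * n_cognate / list_size
    return round(raw / step) * step


# Counts given in the text for Set 1.
stable_12 = cognacy_percentage(7, 12)
stable_47 = cognacy_percentage(17, 47)

print(f"Stable 12: {stable_12}%")
print(f"Stable 47: {stable_47}%")
```

The same function can be applied to any other wordlist once the number of items with proposed cognates has been counted.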


Set 2 (below) contains items of §5 and §6 excluded from Set 1 because of formal or
semantic problems.


Set 2 

ALIVE *a-sen : *hrig-I, *hrin-II (onset) 

BELLY (INTERIOR) *a-tui ` *mik-khlii (semantics ‘intestines’ vs. *tears") 
BELLY (EXTERIOR) *a-fui-buy; *pum (irregular coda correspondence) 
BITTER *affua? ? : *khaa-I, *khaat x *khaak-II (onset) 

BLOOD *a-hui” : *thii (onset) 

CUT (WITHOUT LEAVING THE BLADE) Sid : *thi? (semantics ‘cut, saw’ vs. ‘comb n.") 
DOOR *wun : *thun X *than (semantics ‘door’ vs. “put in, infuse’) 
DRY *a-wuan : *waar (semantics ‘dry’ vs. ‘white’) 

FAT/GREASE *azua® : *thaaw (onset) 

FIREWOOD *sjen : *thir (onset) 

FINGERNAIL *ge-sin : *tin (onset) 

FLOW *nuai : *hnaay (semantics ‘flow’ vs. ‘juice’) 

FRUIT *wai ` *thay (onset) 

MORTAR *fuy-fəm® : *shum (onset) 

NEW *a-fan : *thar (onset) 

OLD (OF THINGS) a-ffjan : *tar (onset) 

SWELL *pan ` *puar (onset and nucleus) 

TONGUE “*a-lui” : *lay (onset and nucleus) 


If all or almost all comparisons in Set 2 turned out to be correct, the percentages
of cognate roots would go up to around 20-25% depending on the basic word list. Some
comparisons in Set 1 and Set 2 may indeed turn out to be untenable. On the other hand, once
the phonological history of both branches is better understood, other roots not included here might
be relatable.

As for now, the cognacy percentages obtained are somewhat lower than what Sun (1993,
353) found when he compared Proto-Tani with Written Tibetan (28-29%), Garo (24-26%),
Written Burmese (27-28.5%), Taraon (29.5-38%), Kaman (21.5-25%), Miji (26-29%) and


57 Tadmor (2009) 

58 Swadesh (1971) 

59 Swadesh (1971)

60 Culturally Appropriate Lexicostatistical Model for SouthEast Asia (Matisoff 1978)

61 List of words reconstructible for Proto-Tani. Based on CALMSEA replacing 37 items (Sun 1993: 352).

62 The 47 top most stable roots in Tibeto-Burman (Matisoff 2009)

63 The 12 top most stable roots in Tibeto-Burman (Matisoff 2009)
Lepcha (23.5-24.5%). Sun's list is tailored to roots which are reconstructible for Proto-Tani,
replacing 37 items (18.5%) from a CALMSEA list. 18.5% new items could seriously
compromise the outcome, which might have been the reason why Sun chose roots which are
quite unique to Tani. Discounting them would not change any of his results by more than 3%.
However, the relatively high within-group diversity of Puroik is indeed one reason for the
lower percentages. If I were asked today to exclude the items from a CALMSEA list which are
not reconstructible for Proto-Puroik, it would be around 50-60 (25-30%), significantly more
than Sun had to exclude.⁶⁴ Basic roots like for example Bulu wa? ‘pig’ (Mizo vawk), Bulu lja?
‘lick’ (Mizo liak-I, liah-II) or the CT existential copula wee (Daai-Chin ve (So-Hartmann 2008))
do have good cognates in Kuki-Chin, but these items are not or not yet reconstructible for
Proto-Puroik and were hence excluded from these counts.

There are other common-Puroik roots which may have cognates in Tibeto-Burman 
languages (other than Kuki-Chin and Kho-Bwa). To name those mentioned in the course of 
this paper: 


Set 3: BIRD *pa-dou, BITE *tua, BLOW *fu?, EAT *ffii, FOUR *vjei, FEMALE/MOTHER *a-mua,
FROG *ra?, HEAD *a-korj, LEAF *ljap, LEG/FOOT *lai, LONG *a-pjay, MAN (MALE) *a-foo, NAME
*a-bjen, NOSE *a-poy, PENIS */ua?, SCRATCH *bju?, THAT *tai, TOOTH *ka-tuay, VOMIT *muai?,
WATER *kua, WEAVE *rua?


Including these items as Tibeto-Burman basic vocabulary in Puroik, the percentages come
to lie around 30% (or 40-50% if non-reconstructible items were left out of consideration).
It is interesting to see that the percentage is much higher in Matisoff's two lists (Stable 12 and
Stable 47), which he considers to be the most stable roots in Tibeto-Burman. It seems that the
more basic the word list is, the higher the percentage of Tibeto-Burman roots in Puroik
becomes. The same pattern can be seen when comparing the percentages of the Swadesh 200 list
with the more basic Swadesh 100 list.


7.2. Could the Tibeto-Burman roots in Puroik be borrowings? 


Blench and Post (2014) have expressed concerns, with explicit reference to Puroik, about
classifying languages just based on a few lexical items, while the remaining 99 percent of
the lexicon is left out of consideration. They maintain that these few roots are not necessarily
indicative of a genetic affiliation and could potentially be explained as borrowings from a
Tibeto-Burman language:


As is well known from voluminous research in contact linguistics, common features in a geographical area
are far from proof of genetic affiliation; while it may well be the case that an armchair glance at, say, a
200-item Puroik (Sulung) wordlist yields greater-than-chance resemblances among certain forms and parallel
items in well-known Tibeto-Burman languages (the usual suspects tend to be ‘fire’, ‘sun/day’, ‘person’,
‘two’ and ‘three’, the first and second person pronouns, and a handful of other common forms), it is wrong
to discount the possibility that such forms could have come about via contact and borrowing. (Blench and
Post 2014)


64 This number is likely to become lower once more data will be available and the historical phonology is better
understood. At the current state of my data and knowledge from Matisoff's (1978) original list: A. Bodypart
(10/39=25.5%): EGG, (HORN), SPIT, TAIL, (FINGER/TOE), BRAIN, NAVEL, SHIT, PISS, SNOT; B. Pronouns/kinship
terms/nouns referring to humans (0/7=0%); C. Foodstuffs (3/5=60%): PEAS/BEANS, POISON,
PLANTAIN/BANANA, MEDICINE/JUICE; D. Animal names and animal products (15/18=83%): MEAT/ANIMAL,
DOG, FISH, LOUSE, SNAKE, INSECT, BEE, DOVE, MONKEY, PIG, FOWL, (OTTER), HORSE, (BEAR), RAT/RODENT; E.
Natural objects (9/33=27.5%): EARTH, GRASS, (MOON), MOUNTAIN, RIVER, (SALT), STAR, SILVER, IRON; F.
Artifacts (4/7=57%): ARROW, NEEDLE, BOAT, (VILLAGE); G. Spatial/directional (0/5=0%); H. Numerals
(3/13=23%): ONE, HUNDRED, MANY; I. Verbs of utterance, body position or function (2/9=22%): BE BORN,
SIT; J. Verbs of motion (3/7=43%): CLIMB, DESCEND, EMERGE; K. Verbs of emotion (1/8=12.5%):
FEAR/FRIGHTEN; L. Stative verbs with human patients (1/8=12.5%): (FAT); M. Stative verbs with non-human
patient (2/18=11%): RED, SHARP; N. Action verbs (11/26=42.5%): STEAL, TIE, LICK, COOK/BOIL, GRIND, WASH,
LET GO/SET FREE/LOOSEN, RUB, SQUEEZE, PUT/PLACE, DRIVE/HUNT. Total (56-64). The most divergent semantic
field is D. Animal names and animal products.


Blench and Post are probably not alone with this suspicion, since Puroik is widely
considered to be very divergent. For example the language catalog Ethnologue⁶⁵ describes
Puroik as “a divergent language which may not be Sino-Tibetan but possibly
Austro-Asiatic⁶⁶.” It is, of course, a highly relevant question whether the Puroik roots with cognates
in Kuki-Chin could be loans from somewhere. If they are all loans from different sources,
searching for regular sound correspondences with Tibeto-Burman languages would be a
meaningless exercise.

As for Kuki-Chin, a direct borrowing situation which could explain whatever is shared
between the two groups is unthinkable.⁶⁷ If the roots in Set 1 were borrowings, they must
have come from another Tibeto-Burman language into Puroik. But which? It must have been 
a language where the word for FIRE has a plosive onset something like PP *bai not something 
like Miji-Bangru mai or Proto-Tani *mo. Furthermore that language must have had relatively 
well preserved codas. The only possible candidates in the region are Bugun, Sartang, 
Sherdukpen and Khispi-Duhumbi in West Kameng, which are in no way more canonical 
Tibeto-Burman languages and the question of their affiliation is the same as for Puroik. But, 
for the argument's sake, suppose that there was a suitable language or the phonological 
questions could be accounted for somehow as innovations after the borrowing. How likely is 
it really that basic words like FIRE, SUN and pronouns were borrowed — per hypothesis not 
only once but from one language to another? Does it commonly happen in the languages of 
the world or is it a rather strong assumption? 

The World Loanword Database (WOLD)⁶⁸ was a typological project at the MPI Leipzig
which aimed at answering and quantifying these kinds of questions by computing a
borrowability and an age index for a list of 1400 “meanings”. The sample of 41 languages
from different phyla and continents includes 6 languages whose lexicon was found to contain
more than 40% borrowings, 2 languages with more than 50% borrowings and one language
with more than 60% borrowings. Averaging the judgements of experts, FIRE scores the fourth
lowest possible average borrowing index of 0.04. It is assigned 1 “clearly borrowed” in one
language (Indonesian) and 0.25 “very little evidence for borrowing” in another (Hausa); all other
languages were given a 0 “no evidence for borrowing”. As for the age index, FIRE scores highest in the
whole database, i.e. of the 1400 meanings in question FIRE has on average the longest attested or
reconstructible history within a language. In 34 out of 41 languages the word for FIRE is
considered to score more than 0.9, which means “first attested or reconstructed earlier than
1000”. As for the whole Set 1 (46 roots), 41 of the 44 roots (93%) which are in Set 1
and in the WOLD have an average borrowing index between 0 and 0.25. All meanings in Set
1 (if included in WOLD) score an age index of more than 0.8, meaning that on average
they have a traceable history of more than 200 years. The pronouns 1SG, 2SG, 1PL, BROTHER
(YOUNGER), FIRE, NEGATION, SON-IN-LAW and FART are in the top 5 percent of items not
being borrowed, and 14 items of Set 1 are in the top 5 percent as for the age index.

The WOLD shows that some words are borrowed more easily and typologically more often
than others and that it makes sense to work with basic wordlists in historical linguistics. The


65 https://www.ethnologue.com/language/suv accessed on 13/11/2015

66 For which I could not find any evidence.

67 The fastest route from Seppa to Aizawl takes 165 hours (~2 weeks) on foot according to Google (using the
bridge over the Brahmaputra in Tezpur).

68 http://wold.clld.org accessed on 13/11/2015 
WOLD does not suggest that there are unborrowable words. Even the word for FIRE can be
borrowed under very special circumstances like in Indonesian, which a thousand years ago
was under the strong influence of a language and a culture where the fire is a deity and has to
be addressed respectfully in a “holy” language.⁶⁹ But that a word like FIRE wanders around
like the words for TEA, RADIO and BUS are known to wander from one language to another is
very hard to imagine. It is possible in principle but very unlikely, and not without a very
extraordinary socio-linguistic setting. Borrowing of core vocabulary is a strong hypothesis
which has to be shown by giving a specific source and contact situation.

The social setting in the Puroik area is indeed extraordinary. Nyishi dialects and Miji- 
Bangru indeed have a very strong influence on Puroik. But the Tibeto-Burman roots in 
question are neither Nyishi nor Miji-Bangru and there are no other extant potential Tibeto- 
Burman donor languages. Since we are dealing with core vocabulary which is not usually 
borrowed, assuming “mass” borrowing from unknown sources under unknown but extreme 
circumstances is not a satisfying solution. 


7.3. Characteristic roots 

The diffusion scenario for Puroik has another problem. If the Puroik languages were a
coherent but non-Tibeto-Burman group, they must have a considerable core of basic vocabulary
shared by all dialects which is not Tibeto-Burman. Some of the best candidates are in Table
86.


Table 86 - Characteristic roots 


Gloss                        B       KR      CT      PP 
CLOTH                        e?      ai?     aik     *at 
CUT (BY HITTING WITH DAO)    peN     pan     paik    *pan 
GIVE                         taN     tay     tay     *tay 
SLEEP                        ram     ram     ram     *rom 
BONE                         a-zeN   a-zan   a-zaik  *a-zan 
EYE                          a-kam   a-kam   a-kak   *a-kom 
KNIFE (MACHETE)              fii     fee     fee     *fee 
KNOW                         deN     dan     daik    *dan 
CAN                          mueN    muan    muaiy   *muan 
MOUTH                        a-sam   a-sam   a-sak   *a-sam 


However, the greater part of the roots common to all three dialects also seems to be relatable 
to Tibeto-Burman. The supposedly non-TB roots are not shared by all dialects (for example 
the roots in Table 3). Given this situation, it is much more likely that these roots were 
borrowed from different substrates into Puroik, which was Tibeto-Burman at its core, than 
that Tibeto-Burman core vocabulary diffused through several unconnected languages. 


7.4. A possible scenario 


The idea that the Tibeto-Burman vocabulary in Puroik is borrowed was rejected in §7.2 
because I believe that borrowing of (hard to borrow) core vocabulary needs to be accounted 
for by describing a specific source and a specific contact situation. If there is no plausible 


69 Indonesian pawaka < Pali pavaka. The Pali word means ‘pure, bright, clear, shining’ in the first place (for 
holy things) and is the name of a particular fire god in the second place. The normal Pali word for fire is aggi. 
(http://dsal.uchicago.edu/dictionaries/pali/ accessed on 19/10/2015) 




North East Indian Linguistics 7 


source or contact situation, the default hypothesis is that core vocabulary is inherited. It is 
more likely that any non-TB roots⁷⁰ are borrowings, possibly from autochthonous 
substrates. Nowadays the Puroik are a marginalised tribe. How could the language of a 
so-called “hunter-gatherer” tribe spread over an area which takes around two weeks to cross 
on foot? Why would the Puroiks be linguistically dominant in this whole area over non-TB 
groups? 

A possible answer could be that the Puroiks were not always the lowest in the social 
hierarchy. What might have made them technologically advanced compared to 
hunter-gatherer groups is the sago culture, which is common to all Puroiks even today. Sago 
production has immense advantages compared to day-to-day food gathering in the forest. In 
one day, a Puroik family is able to produce food for more than a week, which gives people a 
lot of freedom for hunting and other pursuits. In terms of food security, sago production is at 
least equivalent and, in my opinion, even much superior to rice cultivation, since sago palms 
are hardly affected by floods, hailstorms, droughts, pests and bamboo rats. And although sago 
flour can be stored for many months if needed (e.g. for journeys and hunting expeditions), it 
is not necessary to look after delicate granaries, since sago can be harvested at any time of the 
year. 

The sago palm varieties containing enough starch for rewarding exploitation (probably 
Metroxylon varieties) do not grow wild in the jungle but are cultivated in plantations in 
designated areas near the processing places. According to Puroik origin stories, the Puroiks 
were the tribe who brought this kind of sago palm to Arunachal Pradesh. Although it is 
regarded as primitive by rice-cultivating tribes, a sago-based livelihood has nothing in 
common with random food gathering in the forest. It is a kind of forest agriculture involving 
many non-trivial cultural techniques, which are passed on from one generation to the next. 
The knowledge of how to efficiently make compact and storable food from the trunk of a tree 
might once have given the Puroiks the superiority to dominate this whole area culturally and 
linguistically, marginalising and eventually assimilating autochthonous 
hunter-gatherer groups, who contributed the scattered non-TB words to the different Puroik 
dialects. Roots for meanings related to sago are indeed reconstructible on the basis of all 
Puroik dialects, and it would be interesting to see whether they are relatable to the words 
used by other sago-producing Tibeto-Burman tribes (e.g. Nungic Trung in the Trung valley). 

Even if everybody agrees that the Puroiks were already in Arunachal Pradesh when other, 
better-known Tibeto-Burman tribes started to spread, this does not mean that the Puroiks must 
have been the very first humans who ever settled in Arunachal Pradesh. They might have 
migrated from elsewhere as well and, thanks to their superior sago processing technologies, 
become the culturally and linguistically dominant group, until they were marginalised by 
tribes who needed land for rice cultivation. This scenario could account for the linguistic data 
sufficiently well. It does not assume any extraordinary borrowing situations or past 
socio-cultural settings which cannot be inferred from the present. The only truly controversial 
thing about it is that a Tibeto-Burman expansion, which was maybe not even among the first 
ones, would not be an expansion of rice cultivators but of sago cultivators. Population 
geneticists, botanists and anthropologists might know or might find out whether this is a 
possibility. If not, any more plausible socio-historical scenario which can account for the 
Tibeto-Burman part as well as for the non-Tibeto-Burman part of Puroik would provide 
welcome additional input for the discussion. 


70 I am not claiming that the roots that I identified as Tibeto-Burman are the only ones and that all other roots 
are not TB. It is likely that some roots have cognates in branches other than Kuki-Chin, or even in Kuki-Chin, 
and that the relevant phonological and morphological processes (affixes) have not been discovered yet. 
Figure 2 - Sago making in Sanchu (Chayangtajo) 


8. Summary 


This paper has shown that the three selected Puroik 'dialects' are lexically quite diverse, at 
least comparable to the diversity in the whole Tani group. However, there is no reason to 
doubt the first claim made in the introduction, that the Puroik dialects form a coherent group. 
The correspondence of syllable initials in the three dialects is straightforward. Syllable-initial 
plosives (*k, *t, *p, *g, *d, *b), nasals (*m, *n) and sonorants (ht, *r, *], *w) can be 
reconstructed for Proto-Puroik. Fricatives are less well attested and were provisionally 
reconstructed as *h, Ze, *z, SC XN. For vowels the situation is also less clear, but the short 
rhyme correspondences *a, Zo (§3.1) and the open rhyme correspondences *ii, *ua, 
*ai seem secure (§3.2). Coda consonants fall into three correspondence classes (§4.1–4.3): 
nasals corresponding across all three varieties (*-y, *-n, *-m), stops everywhere (*-?, *-k, *-f, 
*-p) and nasal-stop with nasal in B and KR and stop in CT (*-7, *-n, *-m). Although there are 
numerous examples, the question of how the nasal-stop class should be reconstructed had to 
be left open. 

As for identifying regular sound correspondences with other Tibeto-Burman languages, 
the second aim of this article, the degree of certainty is much lower. A first preliminary 
comparison with Kuki-Chin was undertaken, from which the following patterns emerged (in 
brackets number of examples): 


1. Puroik nasal codas correspond to Kuki-Chin nasal codas. (9) 
2. Puroik stop codas correspond to KC stop codas. (15) 
3. The Puroik nasal-stop codas correspond to Kuki-Chin nasal codas. (12) 
4. Kuki-Chin r-codas correspond to the Puroik alveolar nasal-stop coda. (6) 
5. The Kuki-Chin l-coda corresponds to the Puroik alveolar nasal-nasal coda. (2) 
6. Kuki-Chin aspirated plosives correspond to unvoiced plosives in Puroik. (4) 
7. Kuki-Chin unvoiced unaspirated plosives correspond to voiced plosives in Puroik. (5) 
8. Simple onset *m in Kuki-Chin corresponds to Puroik b. (6) 
9. Voiceless *hm in Kuki-Chin corresponds to Puroik m. (3) 
10. Sonorants r, l, w correspond. (4/5/3) 
11. Puroik ua corresponds to long aa in Kuki-Chin in closed syllables. (9) 
12. Simple a corresponds to short a in closed syllables. (7) 
13. What is reconstructed as *th in PKC is lost in Puroik. (6) 


Even if the correspondence patterns (1) to (13) may not be considered proven with 
enough examples at the current stage of research, they form a set of testable working 
hypotheses, which can in principle be empirically falsified with standard procedures of the 
comparative method. 
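
One way to see what "testable" means here is to state a pattern as a mapping and check it mechanically against cognate pairs. The following is a minimal Python sketch for the two onset patterns *m : b and *hm : m. The seven pairs are copied from the Appendix in the plain ASCII form in which they are printed there; the names `PATTERNS`, `onset` and `check`, and the crude segmentation (strip prefixes at the last hyphen, take the consonant letters before the first vowel) are illustrative assumptions and not the procedure used in this paper.

```python
# Hypothesised onset correspondences, Proto-Kuki-Chin -> Proto-Puroik.
PATTERNS = {"m": "b", "hm": "m"}

# (gloss, PKC form, Proto-Puroik form), as printed in the Appendix (ASCII only).
COGNATES = [
    ("DREAM", "may", "bay"),
    ("FIRE", "may", "bai"),
    ("PERSON", "mii", "bii"),
    ("SON-IN-LAW", "maak", "bua?"),
    ("EXTINGUISH", "mit", "bit"),
    ("RIPE", "hmin", "a-min"),
    ("HOLD IN MOUTH", "hmoom", "mom"),
]

VOWELS = set("aeiou")


def onset(form):
    """Return the consonant letters before the first vowel of the root."""
    root = form.lstrip("*").split("-")[-1]  # drop '*' and prefixes such as a-
    i = 0
    while i < len(root) and root[i] not in VOWELS:
        i += 1
    return root[:i]


def check(cognates, patterns):
    """Split cognate pairs into those that fit a pattern and those that do not."""
    fits, exceptions = [], []
    for gloss, pkc, pp in cognates:
        expected = patterns.get(onset(pkc))
        (fits if expected == onset(pp) else exceptions).append(gloss)
    return fits, exceptions


fits, exceptions = check(COGNATES, PATTERNS)
print("regular:", fits)
print("exceptions:", exceptions)
```

A pair whose onsets do not match the stated mapping ends up in `exceptions`; a growing list of such pairs is what would count against a pattern.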

A lexical evaluation of the compared roots in §7.1 revealed that, at the present state of 
knowledge, Tibeto-Burman basic vocabulary makes up between 15 and 30 percent of 
commonly used basic word lists. It was argued that this is in a range similar to that of Tani. 
The non-Tibeto-Burman part of Puroik could be explained as influence from substrates. In 
§7.4 a scenario was sketched of how the language could have spread and been influenced by 
substrates in the past. 

Puroik is still a linguistically quite unexplored group of languages with considerable 
diversity and a fascinating history. The provisional reconstruction of Proto-Puroik will have 
to be improved by documenting and including more data, and data from more Puroik dialects. 
This applies in particular to the dialects to the east of Chayangtajo in Kurung-Kumey and to 
the dialects which are geographically between Kojo-Rojo and CT, like the varieties 
Sario-Saria and the dialects of Rawa, Poube and other villages, which seem to have some 
archaic properties. The hitherto published data of the eastern dialects and this progress report 
are nothing but the first steps in the linguistic research on Puroik. 


Symbols and abbreviations 


*       reconstructed protoform or segment 
dass    hypothetical form, not attested 
M       speculative proto-form 
⪤       allofam (in PKC reconstructions) 
?       form is missing 
-       form is not cognate (and omitted) 
[...]   form is not cognate (but in brackets) 
(...)   at least one segment not regular 
a>b     a becomes b (formally) 
a>>b    a becomes b (semantically) 
Ø       absence of any phoneme 
#       number of examples 

B       Puroik dialect of Bulu 
C       consonant segment 
Ci      syllable initial consonant 
Cf      syllable final consonant 
CT      Puroik dialect of Chayangtajo 
FL      Falam Lai (Central Chin) 
HL      Hakha Lai (Central Chin) 
INTR    intransitive 
KR      Puroik dialect of Kojo and Rojo 
PKC     Proto-Kuki-Chin 
P       plosive 
PP      Proto-Puroik 
STEDT   Sino-Tibetan Etymological Dictionary and Thesaurus 
TB      Tibeto-Burman 
V       vowel segment 
Appendix: Comparative glossary 
Structure of the entries: 


GLOSS Bulu : Kojo-Rojo : Chayangtajo < *Proto-Puroik or *? (other Puroik dialects); 
*Proto-Kuki-Chin > Mizo (or other Central Chin form); comments about form and 
semantics 


[...] form is not relatable with the correspondence patterns proposed in this paper!, (...) 
probably cognate but corresponding irregularly in at least one segment, *? no plausible proto- 
form available yet 
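
For readers who want to process the glossary by machine, the entry structure described above can be split into its parts with a few string operations. The sketch below is only an illustration under stated assumptions: it expects one complete entry per line in plain ASCII, with exactly three dialect forms separated by colons and a Proto-Puroik form after "<". The names `Entry` and `parse_entry` are hypothetical, and entries without a reconstruction are deliberately rejected rather than guessed at.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    gloss: str
    bulu: str
    kojo_rojo: str
    chayangtajo: str
    proto_puroik: str
    rest: str  # Kuki-Chin comparison and comments, left unparsed


def parse_entry(line: str) -> Entry:
    """Parse 'GLOSS B : KR : CT < *PP; rest' into an Entry."""
    head, _, rest = line.partition(";")
    forms, sep, proto = head.partition("<")
    cells = [cell.strip() for cell in forms.split(":")]
    if not sep or len(cells) != 3:
        raise ValueError(f"not a complete entry: {line!r}")
    # The gloss is everything before the last space of the first cell.
    gloss, _, bulu = cells[0].rpartition(" ")
    return Entry(gloss, bulu, cells[1], cells[2], proto.strip(), rest.strip())


entry = parse_entry("DREAM baN : bay : bak < *bay; *may > mang")
print(entry.gloss, entry.proto_puroik)
```

Keeping the Kuki-Chin part as an unparsed string is a deliberate choice in this sketch, since that part mixes reconstructions, attested forms and free comments.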


If not mentioned otherwise the Proto-Kuki-Chin reconstructions and Kuki-Chin language data 
are from VanBik's (2009) dissertation accessed over STEDT?, with following orthography: v 
high tone, v low tone, v rising tone, v falling tone. 


‘which does not necessarily mean that the form is non-cognate, but that no explanation could be found yet how 
to relate it to the other forms. 
*http://stedt.berkeley.edu/~stedt-cgi/rootcanal.pl accessed on 11/11/2015 
1SG guu : goo : goo « *goo; (*kay 3€ 

*kay-ma? > kéi enclitic ka (Marrison 

1967)) 

2sG naa : (nay) : naa < *nay); *nag > 
náng 

3SG vee : wai : wee < *vai 

1PL (g-rii) : ga-nii : g-rei < *ga-nei”); *? 
> keini ‘we (exclusive)’ (Marrison 
1967) 

2PL (na-rii) : na-nii : na-rei < *na-nei"); 
*? > nangni (Marrison 1967) 

IDU ge-se-ni?/ (ga-he-ni?) : ga-se-nii : gə- 
se- nii < *ga-se-nir) 

IMPF -na : -na : -na < *-na 

PRETEMPORAL -ryila : -ruila : -ruila < *- 
ruila 

ONE /tyi] : [kjuu] : [hui] « *? 

TWO ni? : (nii) : nii < *ni?; *ni? X *hni? 
> hnih 

THREE im : sim ` wk < *im?; *thum > pà- 
thum 

FOUR vii ;: waei : waei < *vaei; (*lii > pà- 
li) 

FIVE wuu : woo : wuu < *woo?; (*yaa > 
pa-nga) 

SIX ra? : rə? : rək < *rak; *ruk > rük 

SEVEN rn2-ljee : jei : ljee < *mo-ljai; [*sa- 
ri? > pa-sa-rih] 

EIGHT ma-ljao : jau : (laa) < *ma-ljaa : 
[*riat : pa-riat] 

NINE duNgii : dungiee : dongaee < *doy- 
gjeeO; [*kua > pa-kiia] 

TEN such : fuan : suaik < *suan?; [*soom 
> sàwm] 

ABOVE a-ffaN : a-ffjay : a-fuay < *a- 
fuay” 

ALIVE a-seN : a-san : a-sik < *a-sen; 
(*hrig-I, *hrin-II > hríng-1, hrin-1I) 

ANT (djamdgu?) : gamgau? ` g4engao < 
*ojamgjo? 

AWAKEN (INTR) 3a0 : zau : jaa < *jaa 

BAMBOO (EDIBLE) ma-bjao : ma-baau : 
ma-biaa < *ma-bjaa 

BEFORE bui : bui : bue < *bui 

BELLY (EXTERIOR) a-lyi-buN : hui-buy : 
a-lue-buk < *a-lui-bug; (*pum > pum) 

BELLY (INTERIOR) a-lyi : a-hui : a-lue < 
*q-tui; (*mik-khlii > mit-thlii (FL) 
“tears’); TEARS << EXCREMENT << 
SHIT << INTESTINES Formally related 
roots in e.g. Central Naga all mean 
‘shit? PCN *a-khlaj? (Bruhn 2014) 

BIRD pa-duu : pa-doo : pa-dou < *pa-douO 

BITE (99 : tua : tua < *tua 

BITTER a-fa? : a-ffua? : a-ffjaa < *a- 
Yuar”); (*khaa-I, *khaat 5€ *khaak-II 
> khà-I, kháak-1I) 

BLACK a-hjeN : a-hjei : a-hjé < *a-hjai; 
the reason for the nasalisation of this 
root is unclear 

BLOW fuu : fuu : (fuk) « *fuu 

BLUE a-pii : a-pii : a-pii « *a-pii 

BLOOD a-hui : a-fui : a-hue < *a-hui®; 
(*thii 7 thi) 

BONE a-zeN : a-zan : a-zaik < *a-zan 

BOW lii : lei : lei < *lei(?); *lii > lii (HL) 

BRANCH a-kjee : hain-kaei : haen-kiee < 
*kjai 

BREAST (FEMALE) a-njee : a-njei : a-njee < 
*a-njai; *hnooy > hnóoy (FL) 

BREATHE 3uu : 3uu : joo < *joo 

BRIDGE (NOT HANGING) Ka-tyiN : ka-tun : 
ka-tuik « *ka-tun 

BROTHER (YOUNGER) a-7120 : a-nua : d- 
nua « *a-nua; *naaw > nau 

BURN (TRANSITIVE) rii : rii : rii < *rii 

CAN much : muan : muaiy < *muan 

CANE rii : rei : rei < *rei; *ruy X hruy > 
hri 

CAVE wu? : u? : o0 < *wo?; (*thuuk > 
thiiuk ‘deep, to be profound’) 

CHICKEN /ffa?] : [takjuu] : [sakuu] 

CHILD a-daa : a-doo : a-dou < *a-dou”); 
*tuu > tü-té ‘grandchild, children's 
children’. It is not uncommon that the 
child and grandchild generation 
intersect agewise in village societies. 

CLOTH e? : ai? : aik < *at (Rawa at) 

CRAZY a-bjao : a-baaa : biaa-bo < *a- 
bjaa 

CRY (ffe?) : fap : gon *ifjap?; *krap-I, 
*kra?-II > tàp-I, tàh-II 

CUT (HIT WITH DAO) peN : pan : paik « 
*pan; | *tan > tán ‘chop or cut off, to 
amputate, to cross (river, road, hill 
etc)'] 

CUT (WITHOUT LEAVING THE BLADE) i? : 
i? : ii < *i?; (*thi? > sâm thi?) ‘comb 
n.’; formally perfect but semantically 
far 
DAY a-nii : a-nii : a-rii < *a-nii; *nii > ni 

DIE Zi : ii: ti < i *thii-[, *thi?-II > thi-I, 
thih-II 

DIG ffu? : fu? : foo < *ifo?; *tsaw, *tso?-II 
> chó-I, chàwh-II 

DO/MAKE /fsa?] : [30u] : [kaik] 

DOOR AaN-wuiN : ha-wun : ffuk-wuik < 
*HOUSE-wun, (*thun X *than > thun 
*put in (to anything long and narrow, 
such a bottle, bamboo, pocket, etc), to 
load (as gun)’); semantically far 
formally ok 

DOWN buu : buu : buu < *buu 

DREAM baN : bay : bak < *bay; *may > 
mang 

DRINK in : in : [rig] < *in; *? > in 
(Weidert 1975) 

DRY a-wueN : a-wuan : a-wuaik « *a- 
wuan; *waar > vaar ‘white, to be light 
(not dark)’; DRY >> LIGHT like the 
colour of dry stones, clothes, grass, 
leaves, bones. 

EAR a-kuiN : a-kun : a-kuik < *a-kun; 
*khur 5€ khor > khur ‘hole, pit, cavity’ 
but ‘ear canal’ in Sorbung, Moyon, 
Kom Rem (Old Kuki) 

EAT fii : fii : fii < *ffii 

EXTINGUISH (INTR) /ge?] : bi? : bik < *bit 
(Rawa bit); *mit > mir mi?-II 

EXISTENTIAL COPULA /wee : wai] : wee; 
means ‘exist, have’ in CT and ‘not 
exist, not have’ in KR and Bulu; Dai 
Chin ve ‘exist’ (So-Hartmann 2008) 

EYE a-kam : a-kam : a-kak < *a-kam; *mik 
mit 

FALL (FROM A HEIGHT) /u? : hu? (lu?) : 
ljok-lo < *luKO; *kluu-I, *kluuk-II > 
tlù-I, tlüuk-II ‘fall down (not from a 
height) 

FART wai? : wai : wee < *wai?; *woy? 3€ 
*wey? > voy? (HL) 

FAR a-ffoi : a-fai : a-tfjee < *a-ffuai? 

FAT/GREASE 4-322 : a-zjaa : a-zua < *a- 
zua C; (*thaaw > thàu) 

FATHER see MALE 

FEMALE/MOTHER 4-99 : a-mua : a-mua 
< *a-mua; [*maw > mo ‘bride, a 
daughter-in-law, a sister-in-law, a 
brother's wife'/; formally not possible 


FINGERNAIL (age? ga-sin) : gei-sin : gei- 
sik < *ge-sin; (*tin > tin) 

FIRE bee : bai : bee < *bai; *may > méi 

FIREWOOD /iN : hain : haen < *sjen, 
(*thin > thin) 

FISH /fii : fui] : [kahuan]; [*naa 5€ 
*hyaa > sà-nghá ] 

FLOW nye : nuai : rue < *nuai; *hnaay > 
hnái *pus, sap, juice, exudation' 

FLOWER a-bueN : hain-buan : ma-buaik < 
*buan; *paar > paar 

FOOD ma-lueN : ma-luan : ma-luaik < 
*mo-luan 

FROG ra? : ra? : raa < *ra? 

FRUIT fiN-wee : hsin-wai : roy-wee < 
*wai; *thay > théi 

FULL /jee : jei : ljee « *ljai 

FULL/SATIATED miy : mor : mor < *moy 

GARLIC (ALIUM HOOKERI) daN : dar : dak 


eg EN 

GHOST ma-iao : ma-hau (ma-lau) : ma-taa 
« *ma-taa; *khlaa > thláa ‘spirit, one's 
double, the spirit or soul of a man’ 

GIVE taN : tay : tay < *tay 

GREEN a-rjce : a-rjei : a-rjee < *a-rjai 

GUTS a-lyi-rin : a-hui-rin : a-lue-rig « *a- 
lui-rin; *ril 3<*rul > ril 

HAIR (ON BODY) a-min : a-man : a-muir < 
*a-mun; *mul x *hmul > baal 

HAIR (ON HEAD) ka-zaN : (ka-zjay) : ka-zak 
< *ka-zay; [*s'am > sam] 

HAND/ARM a-ge? : a-gei? : a-geik « *a-gat 
(Rawa got); *kut X *khut > küt; 
aspirated only in Northern Chin 
varieties 

HEAD a-kuN : a-kuy-baa : a-kok-baa < *a- 
kon 

HEART a-luN-baa : a-luyn-baa : a-lok-baa < 
*a-log-baa; *luy > lung 

HOLD IN MOUTH mom : ? : mom < *mom; 
*hmoom > hmáwm ; B mom means ‘to 
close the mouth" 
HUSBAND a-wui : a-wui : a-wue < *a-wui : 
ILL/SICK naN : nay : ray < *nay; [*naa-l, 
*nat-II > náa-I, nát-II]; codas do not 

correspond to Kuki-Chin 

ITCH 20 : a-wua : a-wua < *a-wua; [*thak 
> thak] 

KILL /we?] : ai? : aik < *at (Rawa at); 
*that-I, *-tha?-II > that-I, thàh-II 

KNIFE (MACHETE) fii : fee : fee < *fee® 

KNOW deN : dan : daik < *dan; Mizo théi- 
I, théih-II ‘can, be able’ are false friends, 
neither rhymes nor onsets correspond 

LEAF a-lap : (hain-jap) : a-lak < *ljap 

LEECH /pa-] we? : [pa-] wai? : ka-waik < 
*ka-wat (Rawa pawat); *wat X wot 3€ 
wut > vàng vat; the prefixes p- and k- 
are both attested in TB, but only p- 
could be secondary in Puroik as 
common prefix for lower animals. 

LEFT SIDE pa-fii : pua-fii : pua-fee « *pua- 
fee? ; *? > véi (Weidert 1987) 

LEG a-lee : a-lai : a-lee < *lai; [*phay > 
phéi] 

LICK /ja? : jaa : vjaa < *?; *liak-I, *lia?-1I 
> liak-[, liah-IT 

LIGHT a-b : a-tua : a-tua < *a-tua 

LISTEN min : nun : ron < *noy 

LIVER a-pjiN : a-pjin : a-pjik < *a-pjin; 
*thin > thin; Khumi (Southern Plains 
Chin) pthueng and Lakher (Maraic) pa- 
thi 

LONG a-pjaN : a-paar : a-paar < *a-pjay 

LOUSE (HEAD) //i?] : [haè] : [pace] < *?; 
*hrik > hrik and/or *khra? > thra? 
(HL) ‘body louse’ 

MALE/FATHER d-poo : a-pua : a-pua < *a- 

pua; (*paa > pa) 

MAN a-fuu : a-foo : a-fuu < *a-fuu? 

MARROW (a-lyiN) : a-hin : a-lig < *a-lin; 

*khlik x *khlig > thling 

MEAT //ii] : [mai] : [marjek] « *?; (*shaa 

7 sá) 

MONKEY (MACAQUE) /marar] : [sadun] : 

[mazii] 

MOTHER see FEMALE 

MORTAR Satsam : ffuyffom : tfjugtfok < 

*ffug-ifam; *shum > sum 

MOUTH a-sam : a-sam : a-sak < *a-sam; 

[*kam > kam] 

MUSHROOM miy : may : may < *may 
MUTE/STUPID blo? : blo? : blok < *blok 

NAME a-bjeN : a-buen : a-baey < *a-bjen; 

[*min 3*hmin > hmíng] 

NEAR a-nyi : a-nui : a-nui < *a-nui, 

*naay X *hnaay > hndi-I, hnaih-II 

NECK ka-tuN-rin : tuy-rin : ka-tuy < *ka- 

tun 

NEGATION ba- : ba- : ba- < *ba- 

NEGATIVE EXISTENTIAL COPULA see 

EXISTENTIAL COPULA 

NEW (OF THINGS) a-feN : a-fan : a-faik < 

*a-fan; (*thar > thar) 

NIGHT/DARK a-feN : a-ffen : a-ffik < *a- 

gen), [*thim > thim ‘be dark" 

NOSE a-pur, : a-puy : a-pok < *a-por 

OLD (OF THINGS) a-fseN : a-tfjen : a-faik < 
a-ffjan; (*tar > tar ‘be old or aged") 

PATH /im : lim : lik (Saria) « *lim; (*lam 
> lam); CT [puzue] 

PENIS a-lo? : a-lua? : a-lua < *a-lua? 

PERSON [prin] : bii : bii < *bii; *mii > mí 

PIG /wa?] : [dui] : [madou] < *?; *wok > 
vàwk 

PILLOW ka-kam : kon-kam : ko-kam < 
*koj-kom®; *kham X *khum > lu- 
khàm (FL); The first part lu- (lŭu) in 
FL also means ‘head’, like the first part 
in Puroik *kon-. 

PUROIK (prin-daa) : purun : puruik « 
*purun 

PULL mi : rui : rue < *rui 

QUIVER zap : zap : zak « *zap 

RIPE a-min : a-min : a-miy < *a-min; 
*hmin > hmín 

ROT fam : haam : hjap < *sjamO 

RUN rin : ren : rik « *rin 

SAGO FLOUR bii : bee-mo : bee < *bee); 

can be prepared in three ways with hot 

water, roasted directly in the charcoal, 

eaten as pancake, fermented as beer 


15. A progress report on the historical phonology and affiliation of Puroik 


SAGO CLUB (TOOL) waN : war : wak « 
*way; used to hammer the sago fibres in 
order to loosen the starch mechanically 

and wash out the sago flour later 



SAGO PICK (FRONT PART) kju? ` kau? : 
kuok < *kjok; used to cut sago fibres 
from the trunk in order to hammer them 
with the *war in the next step 


SCRATCH bju? : bau? : baoo < *bjo? 

SEW pin : pin : pig < *pin 

SHADE a-lim : a-him : a-lap < *a-lim(O; 
*hli(i)m > hlim 

SHELF (OVER FIREPLACE) rap : rap : rak 
< *rap; *rap > rap; only Central Chin 

SHOULDER pa-tiy : pua-tuy : pua-tok < 
*pua-ton 

SHY bii-weN : bii-wan : bii-waik < *bii- 
wan 

SIT [rà] : [dao] : [tug] 

SKIN a-ku? : a-ki? : a-kaa < *a-ku??; 
*khok-I, *kho?2-II > khok-I, kho?- 
II (HL) ‘peel off, strip’ 

SKY ha-min : may : ka-may < *ha/ka-may 

SLEEP ram : ram : ram < *ram 

SLEEPY ram-bin : ram-bin : rəm-biņ < 
*ram-bin 

SMELL nam : nam : nay < *nam; *nam > 
nam 

SMOKE be-kii : bai-kaa : bee-kii < *bai- 
kii; *may-khuu > méi khá 

SON-IN-LAW a-bo? : bua? : a-bua < *bua?; 
*maak > maak pa; Missing kinship 
prefix a- in KR is irregular and maybe 
archaic. 

STAND ffin : fin : fin < *fin; [*din-I, *din- 
II > ding-I, din-II] 


STAR [haNwai?] : [hadan] : [hagaik] 

STONE ka-liņ : ka-hun (ka-luy) : [kabsaa] 
< *ka-lug?; *luy > hin 

SUN hamii : hamii : krii < *PFX-nii; *nii > 
ni; hamii<*ham-nii (ham is the Western 
Puroik *sky"-prefix like in SKY, MOON, 
STAR, SNOW, RAIN), Arii< *ka-nii (ka is 
the Eastern Puroik “sky’’-prefix like in 
SKY, CLOUD, SNOW, LIGHTNING, also 
attested in Southern Plains Chin) 
Western Puroik *mn > m is ad-hoc but 
natural. 

SWEET a-pin : a-pin : a-piņ < *a-pin 

SWELL pan : pan : paik < *pan; (*puar > 
par ‘be bulging (as stomach) °); onset 
does not match 

TARO fja? ` fja? : fua < *ffua? 

TASTY/SAVORY (a-jim) : a-rjem : a-rjep < 
*a-rjem; *lim > lim? (Tiddim); 
Northern Chin only 

THAT fee : tai : tee < *tai 

THICK (BOOK) a-pan : a-pan : a-pik < *a- 
pan); Homonymous with SWELL? 

THIN (BOOK) a-5ap : (a-fjam) : a-ffap < 
*a-ifjam 

THIS hin : hay : hay < *hay 

TONGUE a-lyi : jui : (a-rue) < *a-lui®); 
(*lay > léi ) 

TOOTH ka-toN : tuan : ka-tuay < *ka-tuay 

THORN ma-zuN : ma-3uy : ka-zjoy < 
*ma/ka-zon 

UP kuN : kun : kun < *kuy 

URTICA FIBRES /aN : haan : haak < *sjan; 
used to make a bowstring 



VOMIT mue? : muai : mue « *muai? 

WAR mo? : mua? : mua « *mua? 

WARM a-lam : a-lam : a-lap « *a-lom; 
Zum 3€ hlum > lum 

WATER £o : kua : kua < *kua 

WEAVE (ON LOOM) &?-r2? : ai-rua? : aik- 
rua < *at-rua?; [*tak-I, ta?-II > tah] 

WET a-fam : a-haam : a-hjap < *a-hjam” 
WHAT hee : hai : [hii] 

WHITE a-rjuN : a-rjuy : a-rjuy < *a-rjug 

WIFE a-3uu : a-zjoo : a-zou < *a-zjoo”) 

WING a-3uiN : a-3un : a-juik < *a-jun 

WOMAN [maruu] : a-mui : a-mui < *a-mui 

WOOD see FIREWOOD 
16. A methodology for studying Ahom manuscripts: 
How collocations help us¹ 


Poppy Gogoi 
Gauhati University 


The Tai Ahoms came to Assam in 1228 AD where they ruled for a period of more than 600 years. 
Although Tai Ahom was the state language, during this period the Ahoms gradually lost their own 
language and culture. However, the Ahom community still maintains a good store of manuscripts written 
in the Ahom script, some of which are more than 200 years old. In this paper, I will discuss the 
methodology that has enabled the systematic translation of the Ahom manuscripts. Part of this 
methodology involves an understanding of collocations, i.e. combinations of words that tend to occur 
together. It will be shown through examples that even though translations may appear grammatically 
correct, a knowledge of collocations helps us decide on the most appropriate translation. 
1. Introduction 


The Ahoms came to Assam from Mau Lung in 1228 AD across the Patkai range and 
established a kingdom in the Brahmaputra valley (Barua 1930). They ruled in this area for a 
period of more than 600 years. Their language, Tai Ahom, belonged to the Southwestern 
group of the Tai-Kadai language family (Li Fang-Kuei 1977). It was an isolating, 
monosyllabic language. We might also guess that it was tonal, given that the majority of 
languages in the Tai-Kadai family are tonal. 

At present, Tai Ahom is no longer spoken by people who identify as Ahom. The Ahoms 
now speak Assamese as their mother tongue, and the few who know Tai Ahom use it only for 
some ceremonies and religious practices. One of the main reasons for the decline of the Ahom 
language was that from the beginning, when the Ahoms came to establish the kingdom in 
Assam, they were few in number and mostly male. After the establishment of the Ahom 
kingdom in the Assam valley, the Ahoms also did not remain in contact with other Tai 
communities. Consequently, the Ahoms intermarried with non-Tai speakers and gradually 
ceased to use their language and to continue their traditional cultural practices (Morey 2014). 


1 I am very grateful to Dr. Stephen Morey for making me a part of the Ahom Manuscripts Archiving project 
funded by the Endangered Archives Programme of the British Library. It is through his immense support and 
encouragement that I am able to study Tai Ahom, which is also my ancestral language. I am also grateful for the 
guidance that I received from Ajahn Chaichuen Khamdaengyodtai, who is an expert in Tai languages from Chiang 
Mai, Thailand. He not only helped me to learn Tai Shan and to understand its connection with Tai Ahom but 
also guided me in making translations of the manuscripts. I consider myself fortunate to be a student of the 
Department of Linguistics of Gauhati University. I am thankful to Prof. Jyotiprakash Tamuli for providing us, the 
students of the department, with a platform to meet with and learn from research scholars and experts visiting the 
department from different parts of the world. I would also like to thank my colleague Medini Madhab Mohan for 
sharing with me his knowledge of the manuscripts, Syed Iftiqar Rahman for his help and support throughout the 
project, the Institute of Tai Studies and Research (ITSAR) and all the manuscript owners, namely Junaram Sangbun 
Phukan, Tileshwar Mohan, Bidya Phukan, Dulen Phukan, Kesab Baruah, Hara Phukan, whose manuscript 'Lik 
Chau Ngi' I have been translating with Ajahn Chaichuen, and others for sharing with us their valuable 
manuscripts as well as their knowledge about them. 
In contrast, other Tai communities like the Tai Phake, Aiton, Khamti, Khamyang, Turung etc. 
migrated to Assam at different periods in time after the Tai Ahoms and have managed to 
retain their own languages (Morey 2005). 

There are people in the Ahom community who are trying to revive the language but these 
people are not fluent speakers of the language. For instance, some speakers do not make tonal 
contrasts in their speech, while others who have learned Tai Phake or Tai Aiton use the Phake 
tones when they speak Ahom. 

The Ahoms did have a rich writing tradition in the form of manuscripts made from the 
bark of the sasi tree (Aquilaria agallocha). These manuscripts were copied and passed down 
from one generation to the next. The study of these Ahom manuscripts has been a subject of 
interest for scholars, both native and foreign, since the early 19th century. 
These manuscripts deal with a wide range of subjects, including astronomy, fortune telling, 
creation and origin myths, divination and rituals. They also include state chronicles and 
accounts of happenings and events during the Ahom period. The first translation of an Ahom 
manuscript was done in 1837 by Captain F. Jenkins with the help of Juggoram Khargoria 
Phukan and other members of the Deodhai priestly clan (Jenkins 1837). Although they 
correctly identified the content of the manuscript, which dealt with the primordial state of 
affairs and the creation of the world, they were unable to produce an accurate word-for-word 
translation of the text (Terweil 1989). This was probably because the translation was done at a 
time when Ahom as a spoken language was in decline and the few people who knew it 
had only a superficial knowledge of the language. 

Following this first attempt, many other scholars tried to record and translate the Ahom 
manuscripts. The Ahom-Assamese-English Dictionary by Golap Chandra Baruah and the 
Ahom Lexicons by N.N. Deodhai Phukan of the Department of Historical and Antiquarian 
Studies of Assam are two such endeavours to record and translate the meanings of words 
found in the Ahom manuscripts. However, the manuscripts cannot be readily translated with 
the help of these dictionaries, due to the idiosyncrasies of the Ahom script and language, as 
well as errors in the dictionaries themselves (Terweil 1998). 


2. Methodology used for the translation of the Ahom manuscripts. 


The main aim of the project Documenting, Conserving and Archiving the Tai Ahom 
Manuscripts of Assam, funded by the Endangered Archives Programme of the British 
Library,² has been to preserve and archive high quality images of the Ahom manuscripts, so 
that they can be further studied in the future. This is an urgent matter given that the 
manuscripts are very old (some as old as 250 years) and will deteriorate over the course of 
time, especially in the wet and humid climate of Assam. There are a number of manuscript 
owners, including Bidya Phukan, Gileshwar Bailung and Tileshwar Mohan, who have taken 
very good care of the manuscripts that have been entrusted to them. Nevertheless, it is 
important to maintain digitized forms of these manuscripts as a backup. 

In order to begin the translation process, the very first step is to obtain photographs of very 
good quality. For this project, the manuscripts are first photographed in raw format with a 
Canon EOS 7D to produce accurate images of the manuscripts without any distortion. The 
image files are then converted into TIFFs, as required by the British Library, and into JPGs, 
so that copies can be made for the owners of the manuscripts and interested 

2 I have been working on this project since October 2011 under the guidance of Dr. Stephen Morey. All the 
photographs taken as a part of the project (EAP373) will be archived and made accessible via the Endangered 
Archives Programme (http://eap.bl.uk/). It is also intended to make these manuscripts available at 
http://sealang.net. 
people. Like the raw files, the TIFF files are large, so it is more convenient to distribute 
copies in JPG format. 

Before taking the photographs, the manuscripts are first arranged serially by folio 
number.³ Almost all the manuscripts have their folios numbered according to the Ahom 
numbering system, which is a mixed system for expressing the basic numerals. The authors 
and the scribes of the manuscripts numbered the folios using the Ahom numerals.⁴ These 
numerals can be divided into four types: 

(i) Ahom digits that are used only in Ahom and not in any other Tai varieties. These are 
the symbols for ‘1’, ‘7’and ‘8’, 

(ii) Letter digits that are based on Ahom letters. For example, the number ‘2’ is expressed 
by using a symbol similar to the letter kha, 

(ii) Derived digits that are symbols based on a combination of Ahom letters. For 
example, the number ‘6’ is based on na + the medial ra symbol, and ‘9’ is a combination of 
nga + medial ra symbol, 

(iv) Fully spelled out digits such as ‘3’, ‘4’ and ‘5’. (Morey 2012) 
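
The four-way classification above can be written down as a small lookup table, which may be handy when folio numbers are checked during cataloguing. This is only an illustrative sketch: the names `NumeralType`, `NUMERAL_TYPES` and `numeral_type` are hypothetical, and zero is omitted because the text does not discuss it.

```python
from enum import Enum


class NumeralType(Enum):
    """The four types of Ahom numerals listed above (after Morey 2012)."""
    AHOM_DIGIT = "symbol used only in Ahom"
    LETTER_DIGIT = "symbol based on an Ahom letter"
    DERIVED_DIGIT = "symbol based on a combination of Ahom letters"
    SPELLED_OUT = "fully spelled out"


# Assignment of the basic numerals 1-9 to the four types, as given in the text.
# Zero is not mentioned in the text and is therefore left out.
NUMERAL_TYPES = {
    1: NumeralType.AHOM_DIGIT,
    2: NumeralType.LETTER_DIGIT,     # similar to the letter kha
    3: NumeralType.SPELLED_OUT,
    4: NumeralType.SPELLED_OUT,
    5: NumeralType.SPELLED_OUT,
    6: NumeralType.DERIVED_DIGIT,    # na + medial ra
    7: NumeralType.AHOM_DIGIT,
    8: NumeralType.AHOM_DIGIT,
    9: NumeralType.DERIVED_DIGIT,    # nga + medial ra
}


def numeral_type(digit: int) -> NumeralType:
    """Return the type of a basic numeral, or raise ValueError if unclassified."""
    if digit not in NUMERAL_TYPES:
        raise ValueError(f"numeral {digit} is not classified in the text")
    return NUMERAL_TYPES[digit]
```

For instance, `numeral_type(6)` returns `NumeralType.DERIVED_DIGIT`.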




Figure 1 — Sample image of the verso side of folio number 2. 


The image in Figure 1 shows the way the manuscripts were photographed. The color chart is 
used to identify the original color of the manuscript at the time of it being photographed. 
Below the color chart is a scale to measure the size of the manuscript. 


3 By ‘folio’ we mean each sheet of the manuscript. Since each sheet is numbered only on the verso side, ‘folio 1’ 
refers to both the front and back side of the 1st sheet of a manuscript. 

4 The only manuscripts that do not have numbers on them are the fortune telling manuscripts Du Kai Seng. This 
is because the text written on each of the folios is complete and does not relate to folios preceding or following 
them. 

Once the manuscripts have been photographed, a text document is made with lines for a 
word-by-word analysis of each sentence. The fields that were decided upon for the translation 
of the manuscripts are: (a) Ahom script; (b) Transliteration in Roman script; (c) phonemic 
representation; (d) English gloss; (e) Shan gloss. There is also a free translation line in 
English below. An additional Assamese gloss line is also planned. 
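
As a rough illustration of how these fields could be held together, the sketch below models one analysed sentence as a list of word records plus a free translation. The class and attribute names are hypothetical and are not the project's actual file format. The sample data are taken from the Lik Chau Ngi sentence [2v3]; the script fields are left empty here, and treating khai pha as a single unit glossed ‘king’ is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Word:
    """One word-level unit with the five analysis fields (a)-(e)."""
    ahom_script: str       # (a) Ahom script, as in the manuscript
    transliteration: str   # (b) transliteration in Roman script
    phonemic: str          # (c) phonemic representation
    english_gloss: str     # (d) English gloss
    shan_gloss: str        # (e) Shan gloss


@dataclass
class Sentence:
    """A sentence: its manuscript reference, its words, and a free translation."""
    reference: str
    words: List[Word] = field(default_factory=list)
    free_translation: str = ""

    def gloss_line(self) -> str:
        """Join the word-by-word English glosses into one line."""
        return " ".join(w.english_gloss for w in self.words)


# Sample data; the Ahom and Shan script fields are deliberately left empty.
sentence = Sentence(
    reference="[2v3]",
    words=[
        Word("", "khai A", "khai pha", "king", ""),
        Word("", "pO", "po", "strike", ""),
        Word("", "kong", "kong", "drum", ""),
        Word("", "lung", "lung", "big", ""),
        Word("", "nit", "nit", "hurry", ""),
    ],
    free_translation="The King beat the big drum quickly.",
)
```

Keeping the free translation on the sentence rather than on the words mirrors the layout described above, where it sits on a separate line below the word-by-word fields.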

Each line in the folios is entered in the Ahom script field as it appears in the original 
manuscript. Further, each line is broken down word by word as single units to have a word- 
by-word translation of the sentence. In this way, every single unit in the sentence is observed 
equally in relation to the context of the sentence. This prevents the common mistakes that arise 
when one tries to get a general overview of the whole sentence on the basis of a few familiar words. 
Moreover, care is taken to ensure that the translated sentences relate to each other. 

This can be illustrated with an example from the translation of the manuscript Lik Chau 
Ngi — see example (1). This example is the third line on the verso side of folio number 2 
([2v3]) of the manuscript Lik Chau Ngi. 


(1)  a. và VQ C wë Vat " 
     b. khai A     pO      kong  lung  nit 
        [2v3] 
     c. khai pha⁵  po      kong  lung  nit 
     d. king       strike  drum  big   hurry 
     e. ko,Pv.  ebv. gzcr IUcr nQdr; // 
     f. ‘The King beat the big drum quickly.’ 


In this example, 

a. The Ahom text is typed in the first row, using the Ahom manuscript font. Each word in 
the text is separated by spaces. The final letter in each word always represents either a long 
vowel or a consonant marked by a virama, a symbol to indicate the final consonant. The two 
bars at the end of the sentence function like a full stop in English that marks the end of the 
sentence. 

b. The second row contains the transliteration of the Ahom words. The long vowels, which 
always occur in word-final position, are written in capital letters. The capital <A> is for a 
potential long /a:/ vowel and <a> is for the ‘short’ /a/ vowel.⁶ However, there is no clear 
evidence yet of a vowel length distinction. The capital <O> represents the low back rounded 
vowel /ɔ/; however, we do not have evidence that [ɔ] and [o] were contrastive and so 
represent this with o in the phonemic line. Note that the folio number and the line number of 
the text are also included in this line. In the above example, the numbering ‘[2v3]’ means that 
this sentence occurs in the third line on the verso side of the second folio. A single line may 
also have more than two sentences, but only the word at the beginning of the line is 
numbered, i.e. <kong> marks the start of the line. 


5 Sometimes, khai pha ‘king’ is also written <khai A>, but it still has to be read and understood as khai pha. In 
G.C. Barua's dictionary, he translates khrai pha as ‘king’, but in the Ahom script he writes it as <khai A> (Barua 
1920). 

6 All orthographic consonants in Ahom represent a consonant plus an inherent /a/ vowel, unless written with the 
virama or followed by a long vowel. Thus, when the /a/ vowel occurs in word-medial position in a word like 
ban, the medial <a> is not written in between <b> and <n>; the Ahom script writes it as <b n> followed by the 
virama. There might have been a vowel length distinction between ban and baan, but this difference is not 
represented orthographically. 




16. A methodology for studying Ahom manuscripts 


c. This row is the phonemic line which is meant to help in the pronunciation of the words. 
While the previous transliteration line is to help in understanding the combination of the 
orthographic vowels and the consonants, this phonemic line is meant to show the correct 
pronunciations of the words. Note that in this line, the symbol v represents the back 
unrounded vowel /ɯ/, which in the transliteration line is written with a combination of <i> 
and <u> (Morey 2005, Hosken and Morey 2012). 

d. The fourth row contains the English gloss. The content of this line is created by using 
Tai Shan as the intermediate language (see next line) and by consulting the online Ahom 
dictionary,⁷ which is based entirely on words found in manuscripts. 

e. The fifth row contains the Tai Shan counterparts of the Ahom words, written in the 
Shan script. Tai Shan is used as the intermediate language to help in deciphering the meanings 
of the Ahom words. It is the state language of the Shan State in Burma. Tai Shan is also 
spoken in other Tai speaking countries like Thailand, Laos, Vietnam etc. Tai Shan is used 
because Shan is similar to Ahom in that tones were not marked and the phonemic vowels 
were orthographically underspecified until its reform in the 20th century. The Tai Shan glosses 
were provided by Ajahn Chaichuen, a native speaker of Tai Shan with an extensive 
knowledge of Tai literature and both pre- and post-reformation Shan orthography. The 
meanings of these Shan translations were then translated to English using an online Shan 
dictionary. It is through this process that some important differences in the phoneme 
inventories of Tai Shan and Tai Ahom could be discovered. For instance, Tai Shan has the 
vowel sounds /i e ɛ/ whereas in Ahom all these vowels are represented orthographically with 
<i>. It appears that the Ahom vowels /i/ and /e/ merged by the end of the Ahom kingdom 
(Morey 2005). 

f. There is a free translation line below the others which gives the whole meaning of the 
sentence in English. It is our intention to eventually add Assamese free translations as well. 
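
The line references described under (b), such as ‘[2v3]’, follow a regular pattern of folio number, side and line number, so they can be read mechanically. The following is a minimal sketch, not part of the project's tooling: the names `LineReference` and `parse_reference` are hypothetical, and reading ‘r’ as ‘recto’ is an assumption, since the text only spells out ‘v’ as verso.

```python
import re
from typing import NamedTuple


class LineReference(NamedTuple):
    folio: int   # folio (sheet) number
    side: str    # "recto" or "verso"
    line: int    # line number on that side


# Pattern for references such as "[2v3]": folio, side letter, line.
_REFERENCE = re.compile(r"\[(\d+)([rv])(\d+)\]")

# "v" = verso is stated in the text; "r" = recto is assumed here.
_SIDES = {"r": "recto", "v": "verso"}


def parse_reference(text: str) -> LineReference:
    """Parse a reference like '[2v3]' into folio, side and line."""
    match = _REFERENCE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a folio/line reference: {text!r}")
    folio, side, line = match.groups()
    return LineReference(int(folio), _SIDES[side], int(line))
```

With this sketch, `parse_reference("[2v3]")` yields folio 2, verso, line 3, which matches the reading given in the text.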


3. Difficulties in translating Ahom manuscripts. 


Some of the major difficulties we face in the translation of manuscripts are as follows. Firstly, 
Ahom, like the other Tai languages, was most likely a tonal language. However, since the 
tones are not marked orthographically, it is very difficult to make correct translations of 
words. Thus, one has to choose among the different meanings a single written word may have, 
guided by the context in which it occurs. 

The Ahoms had the tradition of copying the manuscripts in order to preserve the old texts. 
However, since the Ahoms stopped speaking Ahom towards the end of their reign, the scribes 
had only a partial knowledge of the language. At times, this resulted in mistakes during 
the copying of the manuscripts. The influence of Assamese sometimes led to incorrect 
interpretations of some Ahom words. It is therefore helpful to obtain more than one copy of the 
manuscript we are translating so that we can compare them and find out if there are any 
spelling mistakes or missing words which may lead to wrong translations. 
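
Comparing two copies word by word can be partly automated once both have been transliterated. The sketch below uses Python's standard `difflib` module to list where two copies diverge. It is an illustration only: the function name `compare_copies` is hypothetical, the first sample string is the phonemic line of a sentence from Lik Chau Ngi, and the second ‘copy’ is invented for the example (one word dropped, one misspelled) and does not come from any real manuscript.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def compare_copies(copy_a: str, copy_b: str) -> List[Tuple[str, List[str], List[str]]]:
    """List word-level differences between two transliterated copies.

    Each result is (operation, words_in_a, words_in_b), where the operation
    is 'delete', 'insert' or 'replace' as reported by difflib.
    """
    words_a = copy_a.split()
    words_b = copy_b.split()
    matcher = SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    differences = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":
            differences.append((tag, words_a[a_start:a_end], words_b[b_start:b_end]))
    return differences


# Copy A: phonemic line of a sentence from Lik Chau Ngi.
copy_a = "khai pha ku pak lat phvng khun"
# Copy B: INVENTED variant for illustration (missing 'lat', 'khun' misspelled).
copy_b = "khai pha ku pak phvng khon"

differences = compare_copies(copy_a, copy_b)
```

Here `differences` flags the missing word lat and the khun/khon mismatch. Deciding which copy is correct still requires the translator's judgement.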

Finally, sometimes even if the translation of an individual sentence makes sense, it may 
not relate to the wider context of the manuscript. Hence, for translation of these manuscripts 
we have formulated a systematic methodology, using our knowledge of word collocations and 
grammatical parts of speech. 

4. How can collocations help in the study of manuscripts? 


7 The online Tai Ahom Dictionary is based on work by Stephen Morey, Chaichuen Khamdaengyodtai and 
Zeenat Tabassum, with the assistance of many Ahom pundits. It is accessible online through the SEALANG 
library at http://sealang.net/ahom and maintained by the Centre for Research in Computational Linguistics, 
Bangkok. 
Collocations are combinations of words that often tend to occur together. In the translation of 
manuscripts, grammatical features of the different word classes in the language help in 
forming a syntactically correct sentence structure. We also know that the word order in Ahom 
is SVO. Some other common grammatical features are highlighted here: adjectives in Ahom 
always occur after the noun they modify; the negative occurs before the verb; and verb 
serialization is very commonly seen in the manuscripts. Sometimes even if a sentence appears 
to be grammatically correct, it may not make much sense if the available collocational 
preferences are overlooked. These features help in the proper translation process of any 
language, and Ahom is no exception. 

Because tones are not represented orthographically in Ahom and one orthographic word 
may have many meanings, arriving at an appropriate translation of the 
texts requires even more contextual information than one might require for other kinds of 
translation. A sound knowledge of Ahom language and culture and its history is thus 
imperative. For instance, the Ahom calendar is based on a 60-year Lak Ni cycle, according to 
which months and years are named. While deciphering a fortune-telling manuscript, one 
frequently encounters references to the names for the months and years. A knowledge of 
the Lak Ni cycle will therefore prevent confusion with other possible interpretations of these 
names. 

The following example (2) from the translation of the manuscript Lik Chau Ngi illustrates 
how a knowledge of collocations can help in the translation of manuscripts. 


(2)  Qj A r wá w E né u 
     khai phA  kU       p(a)k  l(a)t  phiung  khun 
     khai pha  ku       pak    lat    phvng   khun 
     king      to open  mouth  tell   group   prince 
     ko,Pv. gU, bagr, ladr; pucr unr  // 

     ‘The king said to the princes.’ 


Recall that Ahom is a monosyllabic language where a word is the length of a syllable. 
Since tones are not represented in the orthography, the orthographic form of a word may have 
more than six different possible meanings. This makes it difficult to arrive at appropriate 
meanings of words in translating them. For example, let us consider the different possible 
meanings we get from each of the words in the above sentence, as shown in Table 1. 


Table 1: Set of possible meanings for each word in an example sentence 


1        | 2          | 3       | 4         | 5          | 6           | 7 
khai     | pha        | ku      | pak       | lat        | phvng       | khun 
---------+------------+---------+-----------+------------+-------------+---------------- 
to vomit | king       | pair    | ash gourd | to say     | paddy straw | prince 
sick     | knife      | perfume | mouth     | castration | bee         | hair 
to leave | sky/heaven | bed     | rule      |            | distribute  | duck weed 
egg      | stone      | to fear | separate  |            | collect     | mix 
buffalo  | wall       | to sing | plant     |            | rush        | to move rapidly 
to sell  | to split   | to open | strike    |            | group       | 
move     | cliff      | all     | adorn     |            |             | 


Now from this table, we can try to arrive at a suitable meaning of the sentence by 
eliminating what are likely to be incorrect glosses for the words. 




To begin with, if we look at the first two words khai pha, we get seven different meanings 
for each of the two words, or 49 (7 × 7) combinations. However, all these words are either 
nouns or verbs. If we take the first word to be a noun then in Ahom it is more likely that the 
following word will be a verb. If we take the first word to be a verb, then it is equally likely 
that the next word will be either a noun or a verb. The next step is to look at the possible 
semantic interpretations of the two words together. This gives us the following 14 
possibilities: 


1) The king vomit 

2) The king sick 

3) The king leave 

4) The king sell 

5) The king move 

6) Egg of the sky 

7) Leave the sky/heaven 
8) Leave the cliff 

9) Buffalo of the heaven 
10) Buffalo of the cliff 
11) To sell the knife 

12) To move the stone 
13) To move the knife 
14) Move of the king 


The combination of khai and pha is commonly found in Ahom texts. The interpretation of 
khai pha as the king is also found in the Ahom Lexicons based on original Tai manuscripts 
edited by B. Barua and N.N. Deodhai Phukan. Furthermore, it is seen in several manuscripts 
that the words khai and pha often occur together and always refer to ‘a/the king’. Here, the 
literal translation ‘egg’ refers to the seed or cell that is capable of developing into a new 
individual. When the two words khai and pha occur together, the resulting expression refers to 
‘the cell in the sky or heaven that is capable of developing into a new individual’. So, there is a 
very high probability that the first two words khai pha mean ‘king’. Collocations relating to 
‘king’ are further discussed with examples below. 
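
The reasoning for the first two words can be sketched in code: every candidate meaning of khai is paired with every candidate meaning of pha, and a table of known collocations then overrides the word-by-word readings. This is an illustrative sketch only. The candidate lists are those given for columns (1) and (2) of Table 1; the names `CANDIDATES`, `KNOWN_COLLOCATIONS`, `candidate_readings` and `resolve` are hypothetical.

```python
from itertools import product
from typing import List, Tuple

# Candidate meanings for the first two words, from Table 1.
CANDIDATES = {
    "khai": ["to vomit", "sick", "to leave", "egg", "buffalo", "to sell", "move"],
    "pha": ["king", "knife", "sky/heaven", "stone", "wall", "to split", "cliff"],
}

# Collocations whose meaning is established from other texts and lexicons.
KNOWN_COLLOCATIONS = {("khai", "pha"): "king"}


def candidate_readings(words: List[str]) -> List[Tuple[str, ...]]:
    """Every combination of one candidate meaning per word."""
    return list(product(*(CANDIDATES[word] for word in words)))


def resolve(words: List[str]) -> List[str]:
    """Prefer a known collocation; otherwise return all combined readings."""
    key = tuple(words)
    if key in KNOWN_COLLOCATIONS:
        return [KNOWN_COLLOCATIONS[key]]
    return [" + ".join(reading) for reading in candidate_readings(words)]


readings = candidate_readings(["khai", "pha"])   # 7 x 7 = 49 combinations
best = resolve(["khai", "pha"])                  # narrowed to the collocation
```

The point of the lookup is that collocational knowledge removes most of the 49 combinations before any grammatical or semantic filtering is needed.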

If we look at column (5), we find that lat has only two possible meanings: ‘to say’, 
which is a verb, and ‘castration’, which is a noun. Here, we can choose from the two possible 
meanings on the basis of the translations of the previous lines where we find that a king has 
set out on a journey to conquer different kingdoms. Thus, in this context we may leave out 
‘castration’ as a possible interpretation. Consequently, if we accept that lat translates as the 
verb ‘to say’, we see that it now relates to the noun ‘mouth’ in column (4), which further 
relates to the verb ‘to open’ in column (3). So we now have the translation: ‘The king opened 
his mouth to say’. 

Finally, looking at the possible word translations for phvng and khun in columns (6) and 
(7) respectively, we end up with two probable meanings of the sentence: 


(a) The king opened his mouth to say to the group of princes. 
(b) The king opened his mouth to say to move quickly. 


Of these two options, translation (a) is better because the word khun is used many times 
in the manuscript to mean ‘prince’. Therefore, the reading of phvng khun as a ‘group of 
princes’ seems more likely in this context. This reading is confirmed once we work out the 
larger context of the story: the text is about a brave warrior named Chau Ngi who is on a 
journey to conquer different kingdoms of the sky. During this journey, he is leading a big 
group consisting of many princes. 


4.1 Examples of different collocations used to refer to ‘king’ 


It has been observed that in the Ahom manuscripts, there are various different collocations 
that can be translated as ‘king’. These collocations often associate the king with the sky, with 
gold, or refer to the king as an egg, or the seed of the ‘pure Tai race’. These collocations 
referring to the king will be further discussed below. 


(3)  M vot mri Y n dC Wi wu 
     khaai  khaM  ch(a)ng  t(a)k  kU        khU   miung    koi   lai 
     khai   kham  chang    tak    ku        khu   mvng     koi   laai 
     egg    gold  then     FUT    withdraw  army  country  only  much 
     ko,kmr: ` jer, dgr: gU; kU mer gzo: lao WI: x 

     ‘Then the king withdraws the large army from the country.’ 


For example, in (3), the words khai ‘egg’, and kham ‘gold’ always refer to ‘king’ when 
they occur together, instead of meaning ‘golden egg’. The word khai also means ‘ovum’, 
‘seed’, as well as ‘cell’, that is, a seed or cell that is capable of developing into a new individual. 
Thus, when the two words khai kham occur together, the resulting expression refers to the cell 
in the sky or heaven that is capable of developing into a new individual. In this example, it 
translates as ‘the pure golden seed from the sky or heaven’, in other words, ‘the pure lineage 
of the king’. Note that gold is used as a metaphor to show respect to the king, and that 
anything that concerns royalty is considered ‘golden’. 

Similarly, as we previously saw in (2), the collocation of khai ‘egg’, and pha ‘sky’ always 
means ‘a/the king’, instead of meaning ‘egg of the sky’. Another example of this is given 
below in (4). 


(4)  à vay ws wë n WO w. 90 dé " 
     khai phA  tiuw  khiung  khU       mok jA  rim   niw 
     khai pha  ty    khvng   khu       mok ja  rim   niu 
     king      use   thing   property  flower  edge  attach 
     ko,Pv. diuwr: ker kUwr:  mzgryv; himr: nqwr, // 
     [111 

     ‘(You), the king, use these things and property, and the flowers attached at the edge.’ 


4.2 Other idiomatic expressions 


Apart from collocations of two to three words with specific meanings, there are also entire 
phrases in the text that can have idiomatic meanings. For example, in (6), we find an entire 
phrase that literally means ‘between day and night’, but is used to mean ‘to do something day 
and night’. 


(6)  ei aé d d vse we w% C v6 WU U 
     ch(a)w  ngI  paai  khiw        b(a)n  khiw        khiun  b(a)w  lU  [2r6] 
     chaw    ngi  pai   khiw        ban    khiw        khvn   baw    lu 
     RESP⁸   Yi   go    go.between  day    go.between  night  NEG    destroy 
     jw; yI; bo kqwr wnr: kqwr kuin mwr, Į: IU. // 

     ‘King Yi is rushing day and night without stopping.’ 


Here, the phrase khiw ban khiw khvn as a whole means ‘day and night without stopping’. The 
word khiw can mean ‘green’ or ‘do quickly’ but it can also mean ‘go between’. It is this last 
meaning that seems to be appropriate here, combined with wan ‘day’ (also ‘sun’) and khvn 
‘night’. 

Thus, when the words khiw ban khiw khvn occur together, the expression in the context of 
this sentence means ‘to go from one part of the country to another without stopping’. The 
word khiw indicates ‘to go between’ and the words for ‘day and night’ mean ‘to do something 
without stopping’. The expression has now become a Tai idiom, referring back to the old Tai 
tradition of sending messages via parrots, which are believed to fly without stopping.⁹ 
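
The collocations and idioms discussed in this section could be collected in a small lexicon and looked up before individual words are glossed. The sketch below tries the longest known expression first at each position in the phonemic line. It is a hedged illustration, not the procedure actually used in the project: the names `COLLOCATIONS` and `mark_collocations` are hypothetical, and the lexicon contains only the three expressions discussed here.

```python
from typing import List, Optional, Tuple

# Multi-word expressions discussed in this section and their meanings.
COLLOCATIONS = {
    ("khai", "pha"): "king",
    ("khai", "kham"): "king",
    ("khiw", "ban", "khiw", "khvn"): "day and night without stopping",
}
MAX_LENGTH = max(len(key) for key in COLLOCATIONS)


def mark_collocations(words: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Group known collocations, trying the longest expression first.

    Returns (expression, meaning) pairs; the meaning is None for words that
    are not part of a known collocation and still need individual glossing.
    """
    result = []
    i = 0
    while i < len(words):
        # Try spans from the longest possible down to two words.
        for length in range(min(MAX_LENGTH, len(words) - i), 1, -1):
            key = tuple(words[i:i + length])
            if key in COLLOCATIONS:
                result.append((" ".join(key), COLLOCATIONS[key]))
                i += length
                break
        else:
            # No collocation starts here: keep the single word unglossed.
            result.append((words[i], None))
            i += 1
    return result


marked = mark_collocations("chaw ngi pai khiw ban khiw khvn baw lu".split())
```

Words returned with `None` are exactly those that still have to be disambiguated from context, as illustrated with Table 1.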


5. Conclusion 


It can be observed from the above discussion that translation of Ahom manuscripts is not easy 
and that only a systematic methodological process can help in making correct translations. 
The incorrect interpretation of a single word may change the meaning of the whole sentence 
or prevent us from arriving at any conclusion. 

The manuscripts are the only sources that provide a lot of information regarding the 
language and culture of the Ahoms. Hence, correct translation of these manuscripts is 
necessary to know more about the Ahoms. However, many of these manuscripts are in very 
poor condition and may perish very soon. Thus, these manuscripts also need to be preserved, 
as the loss of these manuscripts will also mean loss of the rich language and cultural tradition 
of the Ahoms. 


8 In this example, chaw ‘RESP’ is an honorific particle that is used when referring to a king or to people with some 
respected position. 

9 The phrase khiw ban khiw khvn is also found in the Shan dictionary, where it means ‘non-stop’. The old Tai 
tradition of sending parrots as messengers and their relation to this phrase was pointed out by Ajahn Chaichuen. 
Abbreviations 


FUT future 
NEG negative 
RESP honorific particle 